Nucleotide and amino acid sequences relating to respiratory diseases and obesity

ABSTRACT

This invention relates to genes identified from human chromosome 12q23-qter, which are associated with various diseases, including asthma. The invention also relates to the nucleotide sequences of these genes, isolated nucleic acids comprising these nucleotide sequences, and isolated polypeptides or peptides encoded thereby. The invention further relates to vectors and host cells comprising the disclosed nucleotide sequences, or fragments thereof, as well as antibodies that bind to the encoded polypeptides or peptides. Also related are ligands that modulate the activity of the disclosed genes or gene products. In addition, the invention relates to methods and compositions employing the disclosed nucleic acids, polypeptides or peptides, antibodies, and/or ligands for use in diagnostics and therapeutics for asthma and other diseases.

RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 13/341,457, filed Dec. 20, 2011 (now allowed), which is a divisional of U.S. application Ser. No. 11/690,650, filed Mar. 23, 2007 (now U.S. Pat. No. 8,105,826), which is a divisional of U.S. application Ser. No. 10/021,698, filed Oct. 22, 2001 (now U.S. Pat. No. 7,205,146), which is a continuation of U.S. Ser. No. 09/881,797, filed Jun. 14, 2001 (abandoned), and claims the benefit of provisional application U.S. Ser. No. 60/211,749, filed Jun. 14, 2000 (expired), all of which are incorporated by reference in their entirety.

FIELD OF THE INVENTION

This invention relates to genes identified from human chromosome 12q23-qter, including Gene 454, Gene 561, and Gene 757, which are associated with asthma, obesity, inflammatory bowel disease, and other human diseases. The invention also relates to the nucleotide sequences of these genes, including genomic DNA sequences, cDNA sequences, and single nucleotide polymorphisms. The invention further relates to isolated nucleic acids comprising these nucleotide sequences, and isolated polypeptides or peptides encoded thereby. Also related are expression vectors and host cells comprising the disclosed nucleic acids or fragments thereof, as well as antibodies that bind to the encoded polypeptides or peptides. The present invention further relates to ligands that modulate the activity of the disclosed genes or gene products. In addition, the invention relates to diagnostics and therapeutics for various diseases, including asthma, utilizing the disclosed nucleic acids, polypeptides or peptides, antibodies, and/or ligands.

BRIEF DESCRIPTION OF THE SEQUENCE LISTING

Incorporated herein by reference in its entirety is a Sequence Listing, comprising SEQ ID NO:1 to SEQ ID NO:4687 and a substitute sequence listing filed on Mar. 5, 2004. The Sequence Listing filed on Oct. 22, 2001 is contained on a CD-ROM, three copies of which are filed, the Sequence Listing being in a computer-readable ASCII file named “Seqlist.txt”, created on Jun. 7, 2001 and of 11,976 kilobytes in size, in IBM-PC Windows® NT v4.0 format.

BACKGROUND

Asthma has been linked to markers on human chromosome 12 (Wilson et al., 1998, Genomics, 53: 251-259). In addition, obesity has been linked to asthma (Wilson et al., 1999, Arch. Intern. Med. 159: 2513-14). In particular, chromosomal region 12q23-qter has been associated with a variety of genetic disorders, including male germ cell tumors, histidinemia, growth retardation with deafness and mental retardation, deficiency of Acyl-CoA dehydrogenase, spinal muscular atrophy, Darier disease, cardiomyopathy, Spinocerebellar ataxia-2, brachydactyly, Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan syndrome-1, Cardiofaciocutaneous syndrome, spinal muscular atrophy-4, tyrosinemia, phenylketonuria, B-cell non-Hodgkin lymphoma, Ulnar-mammary syndrome, Holt-Oram syndrome, Scapuloperoneal spinal muscular atrophy, alcohol intolerance, MODY, Diabetes mellitus, noninsulin-dependent 2, and diabetes mellitus insulin-dependent (See National Center for Biotechnology Information; Bethesda, Md.). The genes of this region are also associated with obesity and lung disease, particularly inflammatory lung disease phenotypes such as Chronic Obstructive Lung Disease (COPD), Adult Respiratory Distress Syndrome (ARDS), and asthma. However, few genes in chromosomal region 12q23-qter have been discovered. Thus, there is a need in the art for the identification of specific genes that are involved in these disorders. Identification and characterization of such genes will allow the development of effective diagnostics and therapeutic means to diagnose, prevent, and/or treat lung related disorders, as well as the other diseases described herein.

SUMMARY OF THE INVENTION

This invention relates to isolated DNA comprising genes located on chromosome 12q23-qter (see Table 4). In specific embodiments, the invention relates to isolated nucleic acids comprising 12q23-qter genomic sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO: 156 to SEQ ID NO: 4973), cDNA and EST sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92), BAC sequences (e.g., SEQ ID NO:156 to SEQ ID NO:693), BAC clones and contigs (e.g., SEQ ID NO: 694 to SEQ ID NO: 1265), direct selected sequences (e.g., SEQ ID NO: 1266 to SEQ ID NO: 2052), clusters (e.g., SEQ ID NO: 2053 to SEQ ID NO: 4973), complementary sequences, sequence variants, or fragments thereof, as described herein. The present invention also encompasses nucleic acid probes or primers useful for assaying a biological sample for the presence or expression of 12q23-qter genes.

The invention further encompasses nucleic acid variants comprising single nucleotide polymorphisms (SNPs) identified in several 12q23-qter genes (Table 10; FIGS. 7A-7H; FIGS. 9A-9F; FIGS. 27A-27K; and FIGS. 28A-28C). These include SNPs for gene 454 (SEQ ID NO: 19; FIGS. 7A-7H), gene 561.1 (SEQ ID NO: 31; FIGS. 27A-27K), gene 561.2 (SEQ ID NO: 32; FIGS. 28A-28C), and gene 757 (SEQ ID NO: 90; FIGS. 9A-9F). SNPs can be used to diagnose diseases such as asthma, or to determine a genetic predisposition thereto. In addition, the present invention encompasses nucleic acids comprising alternate splicing variants (e.g., SEQ ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; and SEQ ID NO:80 to SEQ ID NO:81).

This invention also relates to vectors and host cells comprising vectors comprising the 12q23-qter nucleic acid sequences disclosed herein. Such vectors can be used for nucleic acid preparations, including antisense nucleic acids, and for the expression of encoded polypeptides or peptides. Host cells can be prokaryotic or eukaryotic cells. In specific embodiments, an expression vector comprises a DNA sequence encoding the 12q23-qter polypeptide sequence (e.g., SEQ ID NO:93 to SEQ ID NO:155), sequence variants, or fragments thereof, as described herein.

The present invention further relates to isolated 12q23-qter polypeptides and peptides. In specific embodiments, the polypeptides or peptides comprise the amino acid sequences encoded by the 12q23-qter genes (e.g., SEQ ID NO:93 to SEQ ID NO:155), sequence variants, or portions thereof, as described herein. In addition, this invention encompasses isolated fusion proteins comprising 12q23-qter polypeptides or peptides.

The present invention also relates to isolated antibodies, including monoclonal and polyclonal antibodies, and antibody fragments, that are specifically reactive with the 12q23-qter polypeptides, fusion proteins, or variants, or portions thereof, as disclosed herein. In specific embodiments, monoclonal antibodies are prepared to be specifically reactive with a 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155) or peptides, or sequence variants thereof.

In addition, the present invention relates to methods of obtaining 12q23-qter polynucleotides and polypeptides, variant sequences, or fragments thereof, as disclosed herein. Also related are methods of obtaining antibodies and antibody fragments that bind to 12q23-qter polypeptides, variant sequences, or fragments thereof. The present invention also encompasses methods of obtaining 12q23-qter ligands, e.g., agonists, antagonists, inhibitors, and binding factors. Such ligands can be used as therapeutics for asthma and related diseases.

The present invention also relates to diagnostic methods and kits utilizing 12q23-qter (wild-type, mutant, or variant) nucleic acids, polypeptides, antibodies, or functional fragments thereof. Such factors can be used, for example, in diagnostic methods and kits for measuring expression levels of 12q23-qter genes, and to screen for various 12q23-qter-related diseases, especially asthma. In addition, the nucleic acids described herein can be used to identify chromosomal abnormalities affecting 12q23-qter genes, and to identify allelic variants or mutations of 12q23-qter genes in an individual or population.

The present invention further relates to methods and therapeutics for the treatment of various diseases, including asthma. In various embodiments, therapeutics comprising the disclosed 12q23-qter nucleic acids, polypeptides, antibodies, ligands, or variants, derivatives, or portions thereof, are administered to a subject to treat, prevent, or ameliorate asthma. Specifically related are therapeutics comprising 12q23-qter antisense nucleic acids, monoclonal antibodies, and gene therapy vectors. Such therapeutics can be administered alone, or in combination with one or more asthma treatments.

In addition, this invention relates to non-human transgenic animals and cell lines comprising one or more of the disclosed 12q23-qter nucleic acids, which can be used for drug screening, protein production, and other purposes. Also related are non-human knock-out animals and cell lines, wherein one or more endogenous 12q23-qter genes (i.e., orthologs), or portions thereof, are deleted or replaced by marker genes.

This invention further relates to methods of identifying proteins that are candidates for being involved in asthma (i.e., a “candidate protein”). Such proteins are identified by a method comprising: 1) identifying a protein in a first individual having the asthma phenotype; 2) identifying a protein in a second individual not having the asthma phenotype; and 3) comparing the protein of the first individual to the protein of the second individual, wherein a) the protein that is present in the second individual but not the first individual is the candidate protein; or b) the protein that is present in a higher amount in the second individual than in the first individual is the candidate protein; or c) the protein that is present in a lower amount in the second individual than in the first individual is the candidate protein.
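The three-way comparison in step 3 can be illustrated with a minimal Python sketch. The function name, the dictionary representation of measured protein amounts, the protein labels, and the use of a plain numeric comparison are all assumptions made for illustration; the specification does not prescribe a data format or any statistical threshold, and real measurements would need replicate-aware testing.

```python
def candidate_proteins(affected, unaffected):
    """Classify proteins by comparing an affected (first) individual
    with an unaffected (second) individual.

    Both arguments are hypothetical dicts mapping a protein label to a
    measured amount; a missing label or an amount of 0 means "absent".
    Returns a dict of label -> reason for each candidate protein.
    """
    candidates = {}
    for name in sorted(set(affected) | set(unaffected)):
        a = affected.get(name, 0.0)    # amount in first (asthma) individual
        u = unaffected.get(name, 0.0)  # amount in second (non-asthma) individual
        if u > 0 and a == 0:
            candidates[name] = "present only in unaffected"   # case a)
        elif u > a:
            candidates[name] = "higher in unaffected"         # case b)
        elif u < a:
            candidates[name] = "lower in unaffected"          # case c)
        # equal amounts: not a candidate under this scheme
    return candidates


# Made-up amounts, for illustration only.
affected = {"P1": 0.0, "P2": 2.0, "P3": 5.0, "P4": 3.0}
unaffected = {"P1": 1.5, "P2": 4.0, "P3": 1.0, "P4": 3.0}
result = candidate_proteins(affected, unaffected)
```

In this sketch a protein with identical amounts in both individuals (here the hypothetical "P4") is simply left out of the result.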

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1D show the plot of multipoint LOD score against the map location of the markers along chromosome 12 for four phenotypes: asthma, bronchial hyper-responsiveness, total IgE, and specific IgE.

FIGS. 2A-2P show genes mapped to the 12q23-qter interval determined from information that is curated by the National Center for Biotechnology Information, “NCBI”; Bethesda, Md. This particular information contains genes mapped against the Gene Bridge (GB) 4 panel.

FIGS. 3A-3G show genes mapped to the 12q23-qter interval determined from information that is curated by NCBI (Bethesda, Md.). This particular information contains genes mapped against the Gene Bridge (GB) 3 panel.

FIG. 4 shows the integration of the Marshfield Center for Medical Genetics (Marshfield, Wis.) genetic map with GeneMap99 from NCBI. The regions of study mentioned above are indicated at the top of the figure.

FIGS. 5A-5I show the BAC/STS content contig map for chromosome 12.

FIGS. 6A-6U show the results of Northern blot analysis of the Genes of 12q23-qter in various tissues.

FIGS. 7A-7H show the cDNA sequence (SEQ ID NO: 19) and amino acid sequence (SEQ ID NO: 111) of Gene 454 with the corresponding SNPs underlined.

FIG. 8 shows the results of RT-PCR analysis of Gene 561.1 and Gene 561.2.

FIGS. 9A-9F show the cDNA sequence (SEQ ID NO: 90) and amino acid sequence (SEQ ID NO: 153) of Gene 757 with the corresponding SNPs underlined.

FIG. 10 shows the domain structure of Gene 454 and the exon location of the corresponding SNPs.

FIG. 11 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 12 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 13 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (BHR (PC₂₀≦16 mg/ml) and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 14 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (BHR (PC₂₀≦16 mg/ml) and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 15 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (total IgE and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 16 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (total IgE and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 17 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (specific IgE and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 18 shows the significance (−log₁₀(p-value)) for the comparison of SNP allele frequencies in cases (specific IgE and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 19 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 20 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 21 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (BHR (PC₂₀≦16 mg/ml) and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 22 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (BHR (PC₂₀≦16 mg/ml) and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 23 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (total IgE and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 24 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (total IgE and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIG. 25 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (specific IgE and asthma) and controls in the combined population against the relative location (Kb) of SNPs along chromosome 12.

FIG. 26 shows the significance (−log₁₀(p-value)) for the comparison of haplotype frequencies (2-SNP-at-a-time) in cases (specific IgE and asthma) and controls in the US and UK populations against the relative location (Kb) of SNPs along chromosome 12.

FIGS. 27A-27K show the cDNA sequence (SEQ ID NO: 30) and amino acid sequence (SEQ ID NO: 120) of Gene 561.1 with the corresponding SNPs underlined.

FIGS. 28A-28C show the cDNA sequence (SEQ ID NO: 32) and amino acid sequence (SEQ ID NO: 121) of Gene 561.2 with the corresponding SNPs underlined.

DETAILED DESCRIPTION OF THE INVENTION

Chromosome 12q23-qter genes were isolated by narrowly defining the region of chromosome 12q23-qter that showed association with asthma. Chromosome 12q23-qter genes have been implicated in other diseases, including obesity. Bronchial asthma, furthermore, has been linked to intestinal conditions such as inflammatory bowel disease (B. Wallaert et al., 1995, J. Exp. Med. 182:1897-1904). Thus, there was a need to identify and isolate the gene(s) associated with this region of human chromosome 12.

To aid in the understanding of the specification and claims, the following definitions are provided.

DEFINITIONS

“Disorder region” refers to a portion of the human chromosome 12 extending from the marker D12S2070 to the 12q telomere. A “disorder-associated” nucleic acid or “disorder-associated” polypeptide sequence refers to a nucleic acid sequence that maps to region 12q23-qter and polypeptides encoded thereby. For nucleic acid sequences, this encompasses sequences that are homologous or complementary to the reference sequence, as well as “sequence-conservative variants” and “function-conservative variants.” For polypeptide sequences, this encompasses “function-conservative variants.” Also encompassed are naturally-occurring mutations associated with respiratory diseases including, but not limited to, asthma and atopy, as well as other diseases arising from mutations in this region including those described in detail herein. These mutations are not limited to mutations that cause inappropriate expression (e.g., lack of expression, over-expression, and expression in an inappropriate tissue type).

“Sequence-conservative” variants are those in which a change of one or more nucleotides in a given codon position results in no alteration in the amino acid encoded at that position (i.e., silent mutations). “Function-conservative” variants are those in which a change in one or more nucleotides in a given codon position results in a polypeptide sequence in which a given amino acid residue in a polypeptide has been changed without substantially altering the overall conformation and function of the native polypeptide, including, but not limited to, replacement of an amino acid with one having similar physico-chemical properties (such as, for example, acidic, basic, hydrophobic, and the like). “Function-conservative” variants also include analogs of a given polypeptide and any polypeptides that have the ability to elicit antibodies specific to a designated polypeptide.
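As an illustration of the “sequence-conservative” definition, the following Python sketch checks whether a codon change is silent under the standard genetic code. The function and variable names are hypothetical, and the use of the standard (nuclear) code table is an assumption; the specification does not describe any software for this purpose.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def is_sequence_conservative(ref_codon, var_codon):
    """Return True if the variant codon encodes the same amino acid
    (or stop signal) as the reference codon, i.e. the change is silent."""
    return CODON_TABLE[ref_codon.upper()] == CODON_TABLE[var_codon.upper()]


silent = is_sequence_conservative("CTT", "CTC")     # Leu -> Leu
missense = is_sequence_conservative("GAA", "GAC")   # Glu -> Asp
```

A change such as Glu to Asp is not sequence-conservative, but because both residues are acidic it could still qualify as function-conservative in the sense defined above; deciding that requires structural and functional information that this sketch does not model.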

“Nucleic acid” or “polynucleotide” as used herein refers to purine- and pyrimidine-containing polymers of any length, either polyribonucleotides or polydeoxyribonucleotides or mixed polyribo-polydeoxyribonucleotides. This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA hybrids, as well as “peptide nucleic acids” (PNA) formed by conjugating bases to an amino acid backbone. This also includes nucleic acids containing modified bases.

A “coding sequence” or a “protein-coding sequence” is a polynucleotide sequence capable of being transcribed into mRNA and/or capable of being translated into a polypeptide. The boundaries of the coding sequence are typically determined by a translation start codon at the 5′-terminus and a translation stop codon at the 3′-terminus.
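The start-codon and stop-codon boundaries described above can be sketched in Python as follows. This is a deliberately simplified illustration: it assumes a DNA-alphabet sequence on the coding strand, takes the first ATG as the start codon, and ignores introns, alternative start sites, and sequence context. The function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(seq):
    """Return the stretch from the first ATG through the first in-frame
    stop codon (inclusive), or None if no complete coding sequence is found."""
    seq = seq.upper()
    start = seq.find("ATG")          # translation start codon at the 5' end
    if start == -1:
        return None
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]  # include the stop codon at the 3' end
    return None


cds = find_coding_sequence("GGATGAAATTTTAGCC")
```

Note that the out-of-frame "TGA" overlapping the start codon in the example is skipped, because only codons in the frame of the ATG are examined.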

As used herein, the “reference sequence” refers to the sequence used to compare individuals in identifying single nucleotide polymorphisms and the like. “Variant” sequences refer to nucleotide sequences (and in some cases, the encoded amino acid sequences) that differ from the reference sequence(s) at one or more positions. Non-limiting examples of variant sequences include the disclosed single nucleotide polymorphisms (SNPs), alternate splice variants, and the amino acid sequences encoded by these variants.

“Expressed Sequence Tag (EST)” is a nucleic acid that encodes for a portion of or a full-length protein sequence.

“12q23-qter genes” and “12q23-qter nucleic acids” include the genes and ESTs shown in FIGS. 2A-2P and FIGS. 3A-3G, as well as the sequences listed in Table 4 (i.e., Gene 214, Gene 215, Gene 224, Gene 266, Gene 283, Gene 292, Gene 298, Gene 321, Gene 399, Gene 422, Gene 436, Gene 454, Gene 515, Gene 536, Gene 543, Gene 548, Gene 549, Gene 550, Gene 551, Gene 553, Gene 555, Gene 558, Gene 559, Gene 561, Gene 562, Gene 563, Gene 564, Gene 566, Gene 567, Gene 570, Gene 571, Gene 572, Gene 575, Gene 577, Gene 579, Gene 580, Gene 581, Gene 583, Gene 584, Gene 586, Gene 587, Gene 589, Gene 590, Gene 592, Gene 593, Gene 594, Gene 595, Gene 596, Gene 601, Gene 603, Gene 604, Gene 605, Gene 606, Gene 608, Gene 611, Gene 615, Gene 617, Gene 618, Gene 620, Gene 621, Gene 622, Gene 690, Gene 692, Gene 693, Gene 694, Gene 695, Gene 697, Gene 698, Gene 699, Gene 702, Gene 705, Gene 707, Gene 722, Gene 748, Gene 749, Gene 751, Gene 752, Gene 753, Gene 754, Gene 756, Gene 757, Gene 835, and Gene 848).

“12q23-qter proteins” and “12q23-qter polypeptides” include the polypeptide sequences encoded by the genes listed in Table 4.

A “complement” of a nucleic acid sequence as used herein refers to the “antisense” sequence that participates in Watson-Crick base-pairing with the original sequence.
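A minimal Python sketch of this definition for a DNA sequence follows. It assumes an unambiguous A/C/G/T alphabet and reports the antisense strand in the conventional 5′ to 3′ orientation (the reverse complement); the function name is hypothetical.

```python
# Watson-Crick pairing: A<->T and C<->G, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_strand(seq):
    """Return the complementary strand of a DNA sequence, written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


probe_target = antisense_strand("ATGC")
```

Applying the function twice returns the original sequence, which is a convenient sanity check when designing probes or primers against either strand.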

A “probe” refers to a nucleic acid or oligonucleotide that forms a hybrid structure with a sequence in a target region due to complementarity of at least one sequence in the probe with a sequence in the target region.

Nucleic acids are “hybridizable” to each other when at least one strand of nucleic acid can anneal to another nucleic acid strand under defined stringency conditions. As is well known in the art, stringency of hybridization is determined, e.g., by (a) the temperature at which hybridization and/or washing is performed, and (b) the ionic strength and polarity (e.g., formamide) of the hybridization and washing solutions, as well as other parameters. Hybridization requires that the two nucleic acids contain substantially complementary sequences; depending on the stringency of hybridization, however, mismatches may be tolerated. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementarity, variables well known in the art.

“Gene” refers to a DNA sequence that encodes through its template or messenger RNA a sequence of amino acids characteristic of a specific peptide, polypeptide, or protein. The term “gene” as used herein with reference to genomic DNA includes intervening, non-coding regions, as well as regulatory regions, and can include 5′ and 3′ ends.

“Gene sequence” refers to a DNA molecule, including a DNA molecule that contains a non-transcribed or non-translated sequence. The term is also intended to include any combination of gene(s), gene fragment(s), non-transcribed sequence(s), or non-translated sequence(s) that are present on the same DNA molecule.

A gene sequence is “wild-type” if such sequence is usually found in individuals unaffected by the disease or condition of interest. However, environmental factors and other genes can also play an important role in the ultimate determination of the disease. In the context of complex diseases involving multiple genes (“oligogenic disease”), the “wild-type”, or normal sequence can also be associated with a measurable risk or susceptibility, receiving its reference status based on its frequency in the general population. As used herein, “wild-type” refers to the reference sequence. The wild-type sequences are used to identify the variants (single nucleotide polymorphisms) described in detail herein.

A gene sequence is a “mutant” sequence if it differs from the wild-type sequence. For example, a Gene 454 nucleic acid containing a single nucleotide polymorphism is a mutant sequence. In some cases, the individual carrying such gene has increased susceptibility toward the disease or condition of interest. In other cases, the “mutant” sequence might also refer to a sequence that decreases the susceptibility toward a disease or condition of interest, thus acting in a protective manner. Also, a gene is a “mutant” gene if too much (“overexpressed”) or too little (“underexpressed”) of such gene is expressed in the tissues in which such gene is normally expressed, thereby causing the disease or condition of interest.

“cDNA” refers to complementary or copy DNA produced from an RNA template by the action of RNA-dependent DNA polymerase (reverse transcriptase). Thus, a “cDNA clone” means a duplex DNA sequence complementary to an RNA molecule of interest, carried in a cloning vector or PCR amplified. This term includes genes from which the intervening sequences have been removed.

“Recombinant DNA” means a molecule that has been recombined by in vitro splicing and includes cDNA or a genomic DNA sequence.

“Cloning” refers to the use of in vitro recombination techniques to insert a particular gene or other DNA sequence into a vector molecule. In order to successfully clone a desired gene, it is necessary to use methods for generating DNA fragments, for joining the fragments to vector molecules, for introducing the composite DNA molecule into a host cell in which it can replicate, and for selecting the clone having the target gene from amongst the recipient host cells.

“cDNA library” refers to a collection of recombinant DNA molecules containing cDNA inserts, which together comprise the entire genome of an organism. Such a cDNA library can be prepared by methods known to one skilled in the art and described by, for example, Cowell and Austin, 1997, “cDNA Library Protocols,” Methods in Molecular Biology. Generally, RNA is first isolated from the cells of an organism from whose genome it is desired to clone a particular gene.

The term “vector” as used herein refers to a nucleic acid molecule capable of replicating another nucleic acid to which it has been linked. A vector, for example, can be a plasmid.

“Cloning vector” refers to a plasmid or phage DNA or other DNA sequence that is able to replicate in a host cell. The cloning vector is characterized by one or more endonuclease recognition sites at which such DNA sequences may be cut in a determinable fashion without loss of an essential biological function of the DNA, which may contain a marker suitable for use in the identification of transformed cells.

“Expression vector” refers to a vehicle or vector similar to a cloning vector but which is capable of expressing a nucleic acid sequence that has been cloned into it, after transformation into a host. A nucleic acid sequence is “expressed” when it is transcribed to yield an mRNA sequence. In most cases, this transcript will be translated to yield an amino acid sequence. The cloned gene is usually placed under the control of (i.e., operably linked to) an expression control sequence.

“Expression control sequence” or “regulatory sequence” refers to a nucleotide sequence that controls or regulates expression of structural genes when operably linked to those genes. These include, for example, the lac system, the trp system, major operator and promoter regions of the phage lambda, the control region of fd coat protein and other sequences known to control the expression of genes in prokaryotic or eukaryotic cells. Expression control sequences will vary depending on whether the vector is designed to express the operably linked gene in a prokaryotic or eukaryotic host, and may contain transcriptional elements such as enhancer elements, termination sequences, tissue-specificity elements and/or translational initiation and termination sites.

“Operably linked” means that the promoter controls the initiation of expression of the gene. A promoter is operably linked to a sequence of proximal DNA if upon introduction into a host cell the promoter determines the transcription of the proximal DNA sequence(s) into one or more species of RNA. A promoter is operably linked to a DNA sequence if the promoter is capable of initiating transcription of that DNA sequence.

“Host” includes prokaryotes and eukaryotes. The term includes an organism or cell that is the recipient of a replicable expression vector.

The introduction of the nucleic acids into the host cell by any method known in the art, including those described herein, will be referred to herein as “transformation.” The cells into which the nucleic acids described above have been introduced are meant to also include the progeny of such cells.

“Amplification of nucleic acids” refers to methods such as polymerase chain reaction (PCR), ligation amplification (or ligase chain reaction, LCR) and amplification methods based on the use of Q-beta replicase. These methods are well known in the art and described, for example, in U.S. Pat. Nos. 4,683,195 and 4,683,202. Reagents and hardware for conducting PCR are commercially available. Primers useful for amplifying sequences from the disorder region are preferably complementary to, and preferably hybridize specifically to, sequences in the 12q23-qter region or in regions that flank a target region therein. Chromosome 12q23-qter genes generated by amplification may be sequenced directly. Alternatively, the amplified sequence(s) may be cloned prior to sequence analysis.

A nucleic acid or fragment thereof is “substantially homologous” or “substantially similar” to another if, when optimally aligned (with appropriate nucleotide insertions and/or deletions) with the other nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least 60% of the nucleotide bases, usually at least 70%, more usually at least 80%, preferably at least 90%, and more preferably at least 95-98% of the nucleotide bases.
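The identity thresholds above can be illustrated with a short Python sketch that scores two sequences that have already been optimally aligned, with "-" marking insertions or deletions. Producing the optimal alignment itself is outside the scope of this sketch. The function names, the choice to count gapped columns in the denominator, and the reduction of the "95-98%" tier to a single 95% cut-off are assumptions made for illustration; the specification does not fix how gapped positions are counted.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical bases.

    Inputs are two equal-length, pre-aligned sequences using "-" for gaps.
    Columns where both sequences have a gap are ignored; columns with a gap
    in only one sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Identity tiers named in the definition (95 stands in for "95-98%").
THRESHOLDS = (60, 70, 80, 90, 95)


def highest_threshold_met(identity):
    """Return the highest tier the identity reaches, or None if below 60%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("ACGTACGTAC", "ACGTACGTTC")
tier = highest_threshold_met(identity)
```

Under these assumptions, two ten-base sequences differing at one position score 90% identity and fall in the "preferably at least 90%" tier.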

Alternatively, substantial homology or similarity exists when a nucleic acid or fragment thereof will hybridize, under selective hybridization conditions, to another nucleic acid (or a complementary strand thereof). Selectivity of hybridization exists when hybridization which is substantially more selective than total lack of specificity occurs. Typically, selective hybridization will occur when there is at least 55% homology over a stretch of at least nine or more nucleotides, preferably at least 65%, more preferably at least 75%, and most preferably at least 90% (see, M. Kanehisa, 1984, Nucl. Acids Res. 12:203-213). The length of homology comparison, as described, may be over longer stretches, and in certain embodiments will often be over a stretch of at least 14 nucleotides, usually at least 20 nucleotides, more usually at least 24 nucleotides, typically at least 28 nucleotides, more typically at least 32 nucleotides, and preferably at least 36 or more nucleotides.

Nucleic acids referred to herein as “isolated” are nucleic acids separated away from the nucleic acids of the genomic DNA or cellular RNA of their source of origin (e.g., as it exists in cells or in a mixture of nucleic acids such as a library), and may have undergone further processing. “Isolated”, as used herein, refers to nucleic or amino acid sequences that are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated. “Isolated” nucleic acids (polynucleotides) include nucleic acids obtained by methods described herein, similar methods or other suitable methods, including essentially pure nucleic acids, nucleic acids produced by chemical synthesis, by combinations of biological and chemical methods, and recombinant nucleic acids which are isolated. Nucleic acids referred to herein as “recombinant” are nucleic acids which have been produced by recombinant DNA methodology, including those nucleic acids that are generated by procedures which rely upon a method of artificial replication, such as the polymerase chain reaction (PCR) and/or cloning into a vector using restriction enzymes. “Recombinant” nucleic acids are also those that result from recombination events that occur through the natural mechanisms of cells, but are selected for after the introduction to the cells of nucleic acids designed to allow or make probable a desired recombination event. Portions of the isolated nucleic acids which code for polypeptides having a certain function can be identified and isolated by, for example, the method of Jasin, M., et al., U.S. Pat. No. 4,952,501.

In the context of this invention, the term “oligonucleotide” refers to naturally-occurring species or synthetic species formed from naturally-occurring subunits or their close homologs. The term may also refer to moieties that function similarly to oligonucleotides, but have non-naturally-occurring portions. Thus, oligonucleotides may have altered sugar moieties or inter-sugar linkages. Exemplary among these are phosphorothioate and other sulfur-containing species which are known in the art.

As used herein, the terms “protein” and “polypeptide” are synonymous. “Peptides” are defined as fragments or portions of polypeptides, preferably fragments or portions having at least one functional activity (e.g., proteolysis, adhesion, fusion, antigenic, or intracellular activity) in common with the complete polypeptide sequence.

As used herein, “isolated” proteins or polypeptides are proteins or polypeptides purified to a state beyond that in which they exist in cells. In a preferred embodiment, they are at least 10% pure; most preferably, they are substantially purified to 80 or 90% purity. “Isolated” proteins or polypeptides include proteins or polypeptides obtained by methods described infra, similar methods or other suitable methods, and include essentially pure proteins or polypeptides, proteins or polypeptides produced by chemical synthesis or by combinations of biological and chemical methods, and recombinant proteins or polypeptides which are isolated. Proteins or polypeptides referred to herein as “recombinant” are proteins or polypeptides produced by the expression of recombinant nucleic acids.

A “portion”, as used herein with regard to a protein or polypeptide, refers to fragments of that protein or polypeptide. The fragments can range in size from 5 amino acid residues to all but one residue of the entire protein sequence. Thus, a portion or fragment can be at least 5, 5-50, 50-100, 100-200, 200-400, 400-800, or more consecutive amino acid residues of a chromosome 12q23-qter protein or polypeptide, for example, SEQ ID NO:93 to SEQ ID NO:155, or variants thereof.

The term “immunogenic” refers to the ability of a molecule (e.g., a polypeptide or peptide) to elicit a humoral and/or cellular immune response in a host animal.

The term “antigenic” refers to the ability of a molecule (e.g., a polypeptide or peptide) to bind to its specific antibody with sufficiently high affinity to form a detectable antigen-antibody complex.

“Antibodies” refer to polyclonal and/or monoclonal antibodies and fragments thereof, and immunologic binding equivalents thereof, that can bind to asthma proteins and fragments thereof or to nucleic acid sequences from the 12q23-qter region, particularly from the asthma locus or a portion thereof. The term antibody is used to refer either to a homogeneous molecular entity or to a mixture, such as a serum product made up of a plurality of different molecular entities.

The term “monoclonal antibody” or “monoclonal antibody composition”, as used herein, refers to a population of antibody molecules that contain only one species of an antigen binding site capable of immunoreacting with a particular epitope of a 12q23-qter polypeptide or peptide. A monoclonal antibody composition thus typically displays a single binding affinity for a particular 12q23-qter polypeptide or peptide with which it immunoreacts.

The term “ligand” as used herein describes any molecule, protein, peptide, or compound with the capability of directly or indirectly altering the physiological function, stability, or levels of a polypeptide.

A “sample” as used herein refers to a biological sample, such as, for example, tissue or fluid isolated from an individual (including, without limitation, plasma, serum, cerebrospinal fluid, lymph, tears, saliva, milk, pus, and tissue exudates and secretions) or from in vitro cell culture constituents, as well as samples obtained from, for example, a laboratory procedure.

As used herein, the term “ortholog” denotes a gene or polypeptide obtained from one species that has homology to an analogous gene or polypeptide from a different species. This is in contrast to “paralog”, which denotes a gene or polypeptide obtained from a given species that has homology to a distinct gene or polypeptide from that same species.

Standard reference works setting forth the general principles of recombinant DNA technology include J. Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2d Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; P. B. Kaufman et al. (eds), 1995, Handbook of Molecular and Cellular Methods in Biology and Medicine, CRC Press, Boca Raton; M. J. McPherson (ed), 1991, Directed Mutagenesis: A Practical Approach, IRL Press, Oxford; J. Jones, 1992, Amino Acid and Peptide Synthesis, Oxford Science Publications, Oxford; B. M. Austen and O. M. R. Westwood, 1991, Protein Targeting and Secretion, IRL Press, Oxford; D. N. Glover (ed), 1985, DNA Cloning, Volumes I and II; M. J. Gait (ed), 1984, Oligonucleotide Synthesis; B. D. Hames and S. J. Higgins (eds), 1984, Nucleic Acid Hybridization; Wu and Grossman (eds), Methods in Enzymology (Academic Press, Inc.), Vol. 154 and Vol. 155; Quirke and Taylor (eds), 1991, PCR—A Practical Approach; Hames and Higgins (eds), 1984, Transcription and Translation; R. I. Freshney (ed), 1986, Animal Cell Culture; Immobilized Cells and Enzymes, 1986, IRL Press; Perbal, 1984, A Practical Guide to Molecular Cloning; J. H. Miller and M. P. Calos (eds), 1987, Gene Transfer Vectors for Mammalian Cells, Cold Spring Harbor Laboratory Press; M. J. Bishop (ed), 1998, Guide to Human Genome Computing, 2d Ed., Academic Press, San Diego, Calif.; L. F. Peruski and A. H. Peruski, 1997, The Internet and the New Biology: Tools for Genomic and Molecular Research, American Society for Microbiology, Washington, D.C.

Standard reference works setting forth the general principles of immunology include S. Sell, 1996, Immunology, Immunopathology & Immunity, 5th Ed., Appleton & Lange, Publ., Stamford, Conn.; D. Male et al., 1996, Advanced Immunology, 3d Ed., Times Mirror Intl Publishers Ltd., Publ., London; D. P. Stites and A. I. Terr, 1991, Basic and Clinical Immunology, 7th Ed., Appleton & Lange, Publ., Norwalk, Conn.; and A. K. Abbas et al., 1991, Cellular and Molecular Immunology, W. B. Saunders Co., Publ., Philadelphia, Pa. Any suitable materials and/or methods known to those of skill can be utilized in carrying out the present invention; however, preferred materials and/or methods are described. Materials, reagents, and the like to which reference is made in the following description and examples are generally obtainable from commercial sources, and specific vendors are cited herein.

Nucleic Acids

The present invention relates to nucleic acids from chromosome 12q23-qter genes (Table 4; e.g., SEQ ID NO:1 to SEQ ID NO:92), genomic DNA within BAC end sequences (e.g., SEQ ID NO:156 to SEQ ID NO:693), genomic DNA of BAC sequences (e.g., SEQ ID NO:694 to SEQ ID NO:979), direct selected sequences (e.g., SEQ ID NO:980 to SEQ ID NO:1766), clusters (e.g., SEQ ID NO:1767 to SEQ ID NO:4687), RNA, fragments of the genomic, cDNA, or RNA nucleic acids comprising 20, 40, 60, 100, 200, 500 or more contiguous nucleotides, and the complements thereof. Closely related variants are also included as part of this invention, as well as recombinant nucleic acids comprising at least 50, 60, 70, 80, or 90% of the nucleic acids described above which would be identical to nucleic acids from chromosome 12q23-qter genes except for one or a few substitutions, deletions, or additions.

Further, the nucleic acids of this invention include the adjacent chromosomal regions of chromosome 12q23-qter genes required for accurate expression of the respective gene. In a preferred embodiment, the present invention is directed to at least 15 contiguous nucleotides of the nucleic acid sequence of any of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687. More particularly, embodiments of this invention include the BAC clones containing segments of chromosome 12q23-qter genes including RPCI-11_0899A17, RPCI-11_0666B20, RPCI-11_0723P10, RPCI-11_0831E18, RPCI-11_0932D22, and RPCI-11_0702C13. A preferred embodiment is the nucleotide sequence of the BAC clones consisting of SEQ ID NO:694 to SEQ ID NO:979 and those listed in Table 3. Another embodiment is the nucleotide sequence of the BAC end sequences of SEQ ID NO:156 to SEQ ID NO:693.

The invention also relates to direct selected clones and ESTs from the 12q23-qter region (e.g., SEQ ID NO:1 to SEQ ID NO:92). In a preferred embodiment, the invention relates to clusters of nucleic acids combining the direct selected clones with ESTs homologous to the BAC sequences and BAC end sequences (SEQ ID NO:1675 to SEQ ID NO:4594).

The invention also concerns the use of the nucleotide sequence of the nucleic acids of this invention to identify DNA probes for genes of 12q23-qter (SEQ ID NO:1 to SEQ ID NO:92), BAC end sequences (SEQ ID NO:156 to SEQ ID NO:693), BACs (SEQ ID NO:694 to SEQ ID NO:979), direct selected clones (SEQ ID NO:980 to SEQ ID NO:1766), and sequence clusters (SEQ ID NO:1767 to SEQ ID NO:4687), PCR primers to amplify the genes of 12q23-qter, nucleotide polymorphisms (Table 10), and regulatory elements of the genes of 12q23-qter.

This invention further relates to methods of using isolated and/or recombinant 12q23-qter nucleic acids (DNA or RNA) that are characterized by their ability to hybridize to (a) a nucleic acid encoding a protein or polypeptide, such as a nucleic acid having any of the sequences SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, or (b) a fragment of the foregoing (e.g., any of the nucleotide sequences set forth in Tables 8, 9, 11A and 11B). For example, a fragment can comprise the minimum nucleotides of a chromosome 12q23-qter protein required to encode a functional chromosome 12q23-qter protein, or the minimum nucleotides to encode a polypeptide having the amino acid sequence of SEQ ID NO:93 to SEQ ID NO:155, or to encode a functional equivalent thereof. A functional equivalent can include a polypeptide which, when incorporated into a cell, has all or part of the activity of a chromosome 12q23-qter protein. A functional equivalent of a chromosome 12q23-qter protein, therefore, would have a similar amino acid sequence (at least 65% sequence identity) and similar characteristics to, or perform in substantially the same way as, a chromosome 12q23-qter protein. A nucleic acid which hybridizes to a nucleic acid encoding a chromosome 12q23-qter protein or polypeptide, such as SEQ ID NO:93 to SEQ ID NO:155, can be double- or single-stranded. Hybridization to DNA, such as DNA having a sequence set forth in SEQ ID NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID NO:4687, Tables 8, 9, 11A, and 11B, includes hybridization to the strand shown, or to the complementary strand.
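
By way of illustration only, the complementary strand of a DNA sequence written 5′ to 3′ can be derived as its reverse complement. The following Python sketch assumes an unambiguous DNA alphabet (A, C, G, T); the function name `reverse_complement` is hypothetical and is not part of the disclosure.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]
```

A probe designed against the strand shown in a sequence listing and a probe designed against `reverse_complement` of that strand therefore target the two strands of the same duplex.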

The sequences of the present invention may be derived from a variety of sources including DNA, cDNA, synthetic DNA, synthetic RNA, or combinations thereof. Such sequences may comprise genomic DNA, which may or may not include naturally occurring introns. Moreover, such genomic DNA may be obtained in association with promoter regions or poly(A) sequences. The sequences, genomic DNA, or cDNA may be obtained in any of several ways. Genomic DNA can be extracted and purified from suitable cells by means well known in the art. Alternatively, mRNA can be isolated from a cell and used to produce cDNA by reverse transcription or other means.

The present invention also relates to nucleic acids that encode a polypeptide having the amino acid sequence of any one of SEQ ID NO:93 to SEQ ID NO:155, or functional equivalents thereof. A functional equivalent of a 12q23-qter protein includes fragments or variants that perform at least one characteristic function of the 12q23-qter protein (e.g., antigenic or intracellular activity). Preferably, a functional equivalent will share at least 65% sequence identity with the 12q23-qter polypeptide.

Sequence identity calculations can be performed using computer programs, hybridization methods, or calculations. Preferred computer program methods to determine identity and similarity between two sequences include, but are not limited to, the GCG program package, BLASTN, BLASTX, TBLASTX, and FASTA (J. Devereux et al., 1984, Nucleic Acids Research 12(1):387; S. F. Altschul et al., 1990, J. Molec. Biol. 215:403-410; W. Gish and D. J. States, 1994, Nature Genet. 3:266-272; W. R. Pearson and D. J. Lipman, 1988, Proc. Natl. Acad. Sci. USA 85(8):2444-8). The BLAST programs are publicly available from NCBI and other sources. The well-known Smith-Waterman algorithm may also be used to determine identity.

For example, nucleotide sequence identity can be determined by comparing a query sequence to sequences in publicly available sequence databases (NCBI) using the BLASTN2 algorithm (S. F. Altschul et al., 1997, Nucl. Acids Res., 25:3389-3402). The parameters for a typical search are: E=0.05, V=50, B=50, wherein E is the expected probability score cutoff, V is the number of database entries returned in the reporting of the results, and B is the number of sequence alignments returned in the reporting of the results (S. F. Altschul et al., 1990, J. Mol. Biol., 215:403-410).

In another approach, nucleotide sequence identity can be calculated using the following equation: % identity=(number of identical nucleotides)/(alignment length in nucleotides)*100. For this calculation, alignment length includes internal gaps but does not include terminal gaps. Alternatively, nucleotide sequence identity can be determined experimentally using the specific hybridization conditions described below.
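
By way of illustration only, the percent identity equation can be applied to a pairwise alignment as sketched below in Python. The sketch assumes two aligned rows of equal length that use `-` as the gap character, and it interprets “terminal gaps” as alignment columns lying before the first, or after the last, column in which both rows contain a nucleotide. The function name `percent_identity` and this interpretation are assumptions and are not part of the disclosure.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity = identical nucleotides / alignment length * 100.

    Alignment length counts internal gap columns but not terminal gap columns.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have equal length")
    columns = list(zip(aln_a.upper(), aln_b.upper()))
    # Indices of columns in which both rows carry a nucleotide.
    paired = [i for i, (x, y) in enumerate(columns) if x != gap and y != gap]
    if not paired:
        raise ValueError("alignment contains no paired nucleotides")
    # Drop terminal gap columns; internal gap columns remain in the core.
    core = columns[paired[0]:paired[-1] + 1]
    identical = sum(1 for x, y in core if x == y and x != gap)
    return 100.0 * identical / len(core)
```

For the aligned rows `--ACGT-ACGT` and `TTACGTTAC--`, the two leading and two trailing columns are terminal gaps and are excluded, leaving seven columns (one of them an internal gap) with six identical nucleotides, or about 85.7% identity.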

In accordance with the present invention, polynucleotide alterations are selected from the group consisting of at least one nucleotide deletion, substitution, including transition and transversion, insertion, or modification (e.g., via RNA or DNA analogs). Alterations may occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among the nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence. Alterations of a polynucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687 may create nonsense, missense, or frameshift mutations in this coding sequence, and thereby alter the polypeptide encoded by the polynucleotide following such alterations.
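
By way of illustration only, a single-nucleotide substitution can be classified as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion (purine to pyrimidine, or the reverse). The following Python sketch assumes DNA bases only; the function name `classify_substitution` is hypothetical and is not part of the disclosure.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as 'transition', 'transversion', or 'none'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "none"
    # A transition keeps the base within its chemical class.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```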

Such altered nucleic acids, including DNA or RNA, can be detected and isolated by hybridization under high stringency conditions or moderate stringency conditions, for example, which are chosen to prevent hybridization of nucleic acids having non-complementary sequences. “Stringency conditions” for hybridizations is a term of art which refers to the conditions of temperature and buffer concentration which permit hybridization of a particular nucleic acid to another nucleic acid in which the first nucleic acid may be perfectly complementary to the second, or the first and second may share some degree of complementarity which is less than perfect.

For example, certain high stringency conditions can be used which distinguish perfectly complementary nucleic acids from those of less complementarity. “High stringency conditions” and “moderate stringency conditions” for nucleic acid hybridizations are explained in F. M. Ausubel et al. (eds), 1995, Current Protocols in Molecular Biology, John Wiley and Sons, Inc., New York, N.Y., the teachings of which are hereby incorporated by reference. In particular, see pages 2.10.1-2.10.16 (especially pages 2.10.8-2.10.11) and pages 6.3.1-6.3.6. The exact conditions which determine the stringency of hybridization depend not only on ionic strength, temperature and the concentration of destabilizing agents such as formamide, but also on factors such as the length of the nucleic acid sequence, base composition, percent mismatch between hybridizing sequences and the frequency of occurrence of subsets of that sequence within other non-identical sequences. Thus, high or moderate stringency conditions can be determined empirically.

By varying hybridization conditions from a level of stringency at which no hybridization occurs to a level at which hybridization is first observed, conditions which will allow a given sequence to hybridize with the most similar sequences in the sample can be determined. Preferably the hybridizing sequences will have 60-70% sequence identity, more preferably 70-85% sequence identity, and even more preferably 90-100% sequence identity.

Typically, the hybridization reaction is initially performed under conditions of low stringency, followed by washes of varying, but higher stringency. Reference to hybridization stringency, e.g., high, moderate, or low stringency, typically relates to such washing conditions. Hybridization conditions are based on the melting temperature (T_(m)) of the nucleic acid probe or primer and are typically classified by degree of stringency of the conditions under which hybridization is measured (Ausubel et al., 1995). For example, high stringency hybridization typically occurs at about 5-10° C. below the T_(m); moderate stringency hybridization occurs at about 10-20° C. below the T_(m); and low stringency hybridization occurs at about 20-25° C. below the T_(m). The melting temperature can be approximated by formulas known in the art, depending on a number of parameters, such as the length of the hybrid or probe in number of nucleotides, or hybridization buffer ingredients and conditions. As a general guide, T_(m) decreases approximately 1° C. with every 1% decrease in sequence identity at any given SSC concentration. Generally, doubling the concentration of SSC results in an increase in T_(m) of about 17° C. Using these guidelines, the washing temperature can be determined empirically for moderate or low stringency, depending on the level of mismatch sought.
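
By way of illustration only, the rules of thumb relating stringency to melting temperature can be expressed as a short calculation. The following Python sketch reads the stringency offsets as degrees Celsius below T_(m) (5-10 for high, 10-20 for moderate, 20-25 for low) and applies the guide of approximately 1° C. of T_(m) lost per 1% decrease in sequence identity. The names `hybridization_window` and `mismatch_adjusted_tm` are hypothetical, and the T_(m) of the perfectly matched hybrid is taken as an input rather than computed.

```python
# Offsets below Tm, in degrees Celsius, for each stringency class.
STRINGENCY_OFFSETS_C = {
    "high": (5.0, 10.0),
    "moderate": (10.0, 20.0),
    "low": (20.0, 25.0),
}


def hybridization_window(tm_c: float, stringency: str) -> tuple:
    """Return (lowest, highest) temperature in deg C for the given stringency."""
    near, far = STRINGENCY_OFFSETS_C[stringency]
    return (tm_c - far, tm_c - near)


def mismatch_adjusted_tm(tm_perfect_c: float, percent_identity: float) -> float:
    """Approximate Tm of an imperfect hybrid: about 1 deg C lost per 1% mismatch."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    return tm_perfect_c - (100.0 - percent_identity)
```

For instance, under these assumptions a probe whose perfect-match T_(m) is 80° C. would have an approximate T_(m) of 70° C. against a target with 90% identity, and high stringency for the perfect match would fall between 70° C. and 75° C. Actual conditions are determined empirically, as stated above.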

High stringency hybridization conditions are typically carried out at 65 to 68° C. in 0.1×SSC and 0.1% SDS. Highly stringent conditions allow hybridization of nucleic acid molecules having about 95 to 100% sequence identity. Moderate stringency hybridization conditions are typically carried out at 50 to 65° C. in 1×SSC and 0.1% SDS. Moderate stringency conditions allow hybridization of sequences having at least 80 to 95% nucleotide sequence identity. Low stringency hybridization conditions are typically carried out at 40 to 50° C. in 6×SSC and 0.1% SDS. Low stringency hybridization conditions allow detection of specific hybridization of nucleic acid molecules having at least 50 to 80% nucleotide sequence identity.

For example, high stringency conditions can be attained by hybridization in 50% formamide, 5×Denhardt's solution, 5×SSPE or SSC (1×SSPE buffer comprises 0.15 M NaCl, 10 mM Na₂HPO₄, 1 mM EDTA; 1×SSC buffer comprises 150 mM NaCl, 15 mM sodium citrate, pH 7.0), 0.2% SDS at about 42° C., followed by washing in 1×SSPE or SSC and 0.1% SDS at a temperature of at least 42° C., preferably about 55° C., more preferably about 65° C. Moderate stringency conditions can be attained, for example, by hybridization in 50% formamide, 5×Denhardt's solution, 5×SSPE or SSC, and 0.2% SDS at 42° C. to about 50° C., followed by washing in 0.2×SSPE or SSC and 0.2% SDS at a temperature of at least 42° C., preferably about 55° C., more preferably about 65° C. Low stringency conditions can be attained, for example, by hybridization in 10% formamide, 5×Denhardt's solution, 6×SSPE or SSC, and 0.2% SDS at 42° C., followed by washing in 1×SSPE or SSC, and 0.2% SDS at a temperature of about 45° C., preferably about 50° C. in 4×SSC at 60° C. for 30 min.

High stringency hybridization procedures typically (1) employ low ionic strength and high temperature for washing, such as 0.015 M NaCl/0.0015 M sodium citrate, pH 7.0 (0.1×SSC) with 0.1% sodium dodecyl sulfate (SDS) at 50° C.; (2) employ during hybridization 50% (vol/vol) formamide with 5×Denhardt's solution (0.1% weight/volume highly purified bovine serum albumin/0.1% wt/vol Ficoll/0.1% wt/vol polyvinylpyrrolidone), 50 mM sodium phosphate buffer at pH 6.5 and 5×SSC at 42° C.; or (3) employ hybridization with 50% formamide, 5×SSC, 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5×Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42° C., with washes at 42° C. in 0.2×SSC and 0.1% SDS.
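
By way of illustration only, the salt concentrations corresponding to a given SSC strength can be derived by scaling the 1×SSC composition (150 mM NaCl, 15 mM sodium citrate). The following Python sketch is a convenience calculation under that assumption; the names `SSC_1X_MM` and `ssc_composition` are hypothetical and are not part of the disclosure.

```python
# Composition of 1x SSC in millimolar, as recited in the specification.
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold: float) -> dict:
    """Return the millimolar concentration of each salt at `fold` x SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}
```

Under this scaling, 0.1×SSC corresponds to 15 mM (0.015 M) NaCl and 1.5 mM (0.0015 M) sodium citrate, which agrees with the high stringency wash recited above, and 5×SSC corresponds to 750 mM NaCl and 75 mM sodium citrate.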

In one particular embodiment, high stringency hybridization conditions may be attained by:

-   Prehybridization treatment of the support (e.g., nitrocellulose filter or nylon membrane), to which is bound the nucleic acid capable of hybridizing with any of the sequences of the invention, is carried out at 65° C. for 6 hr with a solution having the following composition: 4×SSC, 10×Denhardt's (1×Denhardt's comprises 1% Ficoll, 1% polyvinylpyrrolidone, 1% BSA (bovine serum albumin); 1×SSC consists of 0.15 M NaCl and 0.015 M sodium citrate, pH 7);
-   Replacement of the pre-hybridization solution in contact with the support by a buffer solution having the following composition: 4×SSC, 1×Denhardt's, 25 mM NaPO₄, pH 7, 2 mM EDTA, 0.5% SDS, 100 μg/ml of sonicated salmon sperm DNA containing a nucleic acid derived from the sequences of the invention as probe, in particular a radioactive probe, and previously denatured by a treatment at 100° C. for 3 min;
-   Incubation for 12 hr at 65° C.;
-   Successive washings with the following solutions: 1) four washings with 2×SSC, 1×Denhardt's, 0.5% SDS for 45 min at 65° C.; 2) two washings with 0.2×SSC, 0.1×SSC for 45 min at 65° C.; and 3) 0.1×SSC, 0.1% SDS for 45 min at 65° C.

Additional examples of high, medium, and low stringency conditions can be found in Sambrook et al., 1989. Exemplary conditions are also described in M. H. Krause and S. A. Aaronson, 1991, Methods in Enzymology, 200:546-556; Ausubel et al., 1995. It is to be understood that the low, moderate and high stringency hybridization/washing conditions may be varied using a variety of ingredients, buffers, and temperatures well known to and practiced by the skilled practitioner.

Isolated and/or recombinant nucleic acids that are characterized by their ability to hybridize to a) a nucleic acid encoding a chromosome 12q23-qter polypeptide, such as the nucleic acids depicted as SEQ ID NO:1 to SEQ ID NO:92; b) the complement of (a); or c) a portion of (a) or (b) (e.g., under high or moderate stringency conditions), may further encode a protein or polypeptide having at least one function characteristic of a chromosome 12q23-qter polypeptide, such as Gene 702, a metalloprotease-like gene involved in inflammatory responses including tissue destruction and repair, or binding of antibodies that also bind to non-recombinant chromosome 12q23-qter proteins or polypeptides. The catalytic or binding function of a protein or polypeptide encoded by the hybridizing nucleic acid may be detected by standard enzymatic assays for activity or binding (e.g., assays that measure the binding of a transit peptide or a precursor, or other components of the translocation machinery). Enzymatic assays, complementation tests, or other suitable methods can also be used in procedures for the identification and/or isolation of nucleic acids which encode a polypeptide such as a polypeptide of the amino acid sequences SEQ ID NO:93 to SEQ ID NO:155, or a functional equivalent of these polypeptides. The antigenic properties of proteins or polypeptides encoded by hybridizing nucleic acids can be determined by immunological methods employing antibodies that bind to a chromosome 12q23-qter polypeptide, such as immunoblot, immunoprecipitation and radioimmunoassay. PCR methodology, including RAGE (Rapid Amplification of Genomic DNA Ends), can also be used to screen for and detect the presence of nucleic acids which encode chromosome 12q23-qter gene-like proteins and polypeptides, and to assist in cloning such nucleic acids from genomic DNA. PCR methods for these purposes can be found in Innis, M. A., et al., 1990, PCR Protocols: A Guide to Methods and Applications, Academic Press, Inc., San Diego, Calif., incorporated herein by reference.

It is understood that, as a result of the degeneracy of the genetic code, many nucleic acid sequences are possible which encode a chromosome 12q23-qter gene-like protein or polypeptide. Some of these will have little homology to the nucleotide sequences of any known or naturally-occurring chromosome 12q23-qter gene-like gene but can be used to produce the proteins and polypeptides of this invention by selection of combinations of nucleotide triplets based on codon choices. Such variants, while not hybridizable to a naturally-occurring chromosome 12q23-qter gene, are contemplated within this invention.
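
By way of illustration only, the number of distinct nucleotide sequences that encode a given peptide under the standard genetic code is the product of the number of synonymous codons for each residue. The following Python sketch builds the standard codon table from its conventional compact form (bases ordered T, C, A, G) and counts the encodings; the names `CODON_TABLE`, `DEGENERACY`, and `count_encodings` are hypothetical and are not part of the disclosure.

```python
from collections import Counter
from itertools import product

# Standard genetic code in compact form: codons enumerated with bases in the
# order T, C, A, G, the third position varying fastest; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of synonymous codons for each amino acid (and for stop).
DEGENERACY = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Count the coding sequences (excluding any stop codon) for `peptide`."""
    total = 1
    for residue in peptide.upper():
        if residue == "*" or residue not in DEGENERACY:
            raise ValueError(f"unsupported residue: {residue!r}")
        total *= DEGENERACY[residue]
    return total
```

Even a short peptide admits many encodings: the tripeptide Met-Leu-Ser (`"MLS"`) can be encoded by 1 × 6 × 6 = 36 different nucleotide sequences, which is why some encoding sequences share little homology with the naturally-occurring gene.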

Also encompassed by the present invention are alternate splice variants produced by differential processing of the primary transcript(s) from 12q23-qter genomic DNA. An alternate splice variant may comprise, for example, the sequence of any one of SEQ ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; and SEQ ID NO:80 to SEQ ID NO:81. Alternate splice variants can also comprise other combinations of introns/exons of 12q23-qter genes, which can be determined by those of skill in the art. Alternate splice variants can be determined experimentally, for example, by isolating and analyzing cellular RNAs (e.g., Northern blotting or PCR), or by screening cDNA libraries using the 12q23-qter nucleic acid probes or primers described herein. In another approach, alternate splice variants can be predicted using various methods, computer programs, or computer systems available to practitioners in the field.

General methods for splice site prediction can be found in Nakata, 1985, Nucleic Acids Res. 13:5327-5340. In addition, splice sites can be predicted using, for example, the GRAIL™ (E. C. Uberbacher and R. J. Mural, 1991, Proc. Natl. Acad. Sci. USA, 88:11261-11265; E. C. Uberbacher, 1995, Trends Biotech., 13:497-500); GenView (L. Milanesi et al., 1993, Proceedings of the Second International Conference on Bioinformatics, Supercomputing, and Complex Genome Analysis, H. A. Lim et al. (eds), World Scientific Publishing, Singapore, pp. 573-588); SpliceView (The Institute of Biomedical Technologies I.T.B.; Italy); and HSPL (V. V. Solovyev et al., 1994, Nucleic Acids Res. 22:5156-5163; V. V. Solovyev et al., 1994, “The Prediction of Human Exons by Oligonucleotide Composition and Discriminant Analysis of Spliceable Open Reading Frames,” R. Altman et al. (eds), The Second International conference on Intelligent systems for Molecular Biology, AAAI Press, Menlo Park, Calif., pp. 354-362; V. V. Solovyev et al., 1993, “Identification Of Human Gene Functional Regions Based On Oligonucleotide Composition,” L. Hunter et al. (eds), In Proceedings of First International conference on Intelligent System for Molecular Biology, Bethesda, pp. 371-379) computer systems.

Additionally, computer programs can be used, such as GeneParser (E. E. Snyder and G. D. Stormo, 1995, J. Mol. Biol. 248:1-18; E. E. Snyder and G. D. Stormo, 1993, Nucl. Acids Res. 21(3):607-613; Boulder, Colo.); MZEF (M. Q. Zhang, 1997, Proc. Natl. Acad. Sci. USA, 94:565-568; Cold Spring Harbor Laboratory; Cold Spring Harbor, N.Y.); MORGAN (S. Salzberg et al., 1998, J. Comp. Biol. 5:667-680; S. Salzberg et al. (eds), 1998, Computational Methods in Molecular Biology, Elsevier Science, New York, N.Y., pp. 187-203); VEIL (J. Henderson et al., 1997, J. Comp. Biol. 4:127-141); GeneScan (S. Tiwari et al., 1997, CABIOS (BioInformatics) 13:263-270); GeneBuilder (L. Milanesi et al., 1999, Bioinformatics 15:612-621); Eukaryotic GeneMark (J. Besemer et al., 1999, Nucl. Acids Res. 27:3911-3920); and FEXH (V. V. Solovyev et al., 1994, Nucleic Acids Res. 22:5156-5163). In addition, splice sites (i.e., former or potential splice sites) in cDNA sequences can be predicted using, for example, the RNASPL (V. V. Solovyev et al., 1994, Nucleic Acids Res. 22:5156-5163); or INTRON (A. Globek et al., 1991, INTRON version 1.1 manual, Laboratory of Biochemical Genetics, NIMH, Washington, D.C.) programs.

The present invention also encompasses naturally-occurring polymorphisms of 12q23-qter genes. As will be understood by those in the art, the genomes of all organisms undergo spontaneous mutation in the course of their continuing evolution, generating variant forms of gene sequences (Gusella, 1986, Ann. Rev. Biochem. 55:831-854). Restriction fragment length polymorphisms (RFLPs) include variations in DNA sequences that alter the length of a restriction fragment in the sequence (Botstein et al., 1980, Am. J. Hum. Genet. 32:314-331). RFLPs have been widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369; Donis-Keller, 1987, Cell 51:319-337; Lander et al., 1989, Genetics 121:85-99). Short tandem repeats (STRs) include tandem di-, tri- and tetranucleotide repeated motifs, also termed variable number tandem repeat (VNTR) polymorphisms. VNTRs have been used in identity and paternity analysis (U.S. Pat. No. 5,075,217; Armour et al., 1992, FEBS Lett. 307:113-115; Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large number of genetic mapping studies.

Single nucleotide polymorphisms (SNPs) are far more frequent than RFLPs, STRs, and VNTRs. SNPs may occur in protein coding (e.g., exon), or non-coding (e.g., intron, 5′UTR, 3′UTR) sequences. SNPs in protein coding regions may comprise silent mutations that do not alter the amino acid sequence of a protein. Alternatively, SNPs in protein coding regions may produce conservative or non-conservative amino acid changes, described in detail below. In some cases, SNPs may give rise to the expression of a defective or other variant protein and, potentially, a genetic disease. SNPs within protein-coding sequences can give rise to genetic diseases, for example, in the β-globin (sickle cell anemia) and CFTR (cystic fibrosis) genes. In non-coding sequences, SNPs may also result in defective protein expression (e.g., as a result of defective splicing). Other single nucleotide polymorphisms have no phenotypic effects.
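
By way of illustration only, the effect of a coding-region SNP on the encoded amino acid can be classified by translating the reference and variant codons with the standard genetic code. The following Python sketch distinguishes silent, missense, and nonsense changes, and reports loss of a stop codon separately; it does not distinguish conservative from non-conservative substitutions. The name `classify_coding_snp` and the returned labels are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code in compact form (bases ordered T, C, A, G; '*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base codon change by its effect on the protein."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases (A, C, G, T)")
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if differences != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"        # amino acid sequence unchanged
    if alt_aa == "*":
        return "nonsense"      # premature stop codon introduced
    if ref_aa == "*":
        return "stop_lost"     # stop codon replaced by an amino acid
    return "missense"          # one amino acid replaced by another
```

For example, the well-known sickle cell change in the β-globin gene, GAG to GTG (glutamic acid to valine), is classified as `"missense"`, whereas CTT to CTC (both leucine) is `"silent"`.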

Single nucleotide polymorphisms can be used in the same manner as RFLPs and VNTRs, but offer several advantages. Single nucleotide polymorphisms tend to occur with greater frequency and are typically spaced more uniformly throughout the genome than other polymorphisms. Also, different SNPs are often easier to distinguish than other types of polymorphisms (e.g., by use of assays employing allele-specific hybridization probes or primers). In one embodiment of the present invention, a 12q23-qter nucleic acid contains at least one SNP as set forth in Table 10, FIGS. 7A-7H; FIGS. 9A-9F; FIGS. 27A-27K; and FIGS. 28A-28C, described herein. Various combinations of these SNPs are also encompassed by the invention. In a preferred aspect, a 12q23-qter SNP is associated with a lung-related disorder, such as asthma. Nucleic acids comprising such SNPs can be used as diagnostic and/or therapeutic reagents.

The nucleic acid sequences of the present invention may be derived from a variety of sources including DNA, cDNA, synthetic DNA, synthetic RNA, or combinations thereof. Such sequences may comprise genomic DNA, which may or may not include naturally occurring introns. Moreover, such genomic DNA may be obtained in association with promoter regions or poly(A)+ sequences. The sequences, genomic DNA, or cDNA may be obtained in any of several ways. Genomic DNA can be extracted and purified from suitable cells by means well known in the art. Alternatively, mRNA can be isolated from a cell and used to produce cDNA by reverse transcription or other means.

The nucleic acids described herein are used in the methods of the present invention for production of proteins or polypeptides, through incorporation into cells, tissues, or organisms. In one embodiment, DNA containing all or part of the coding sequence for a 12q23-qter polypeptide, or DNA which hybridizes to DNA having the sequence of any one of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, or a fragment thereof, is incorporated into a vector for expression of the encoded polypeptide in suitable host cells. The encoded amino acid sequence consisting of a 12q23-qter polypeptide, or its functional equivalent, is capable of normal activity, such as antigenic or intracellular activity.

The invention also concerns the use of the nucleotide sequence of the nucleic acids of this invention to identify DNA probes for 12q23-qter genes, PCR primers to amplify 12q23-qter genes, nucleotide polymorphisms in 12q23-qter genes, and regulatory elements of 12q23-qter genes.

The nucleic acids of the present invention find use as primers and templates for the recombinant production of disorder-associated peptides or polypeptides, for chromosome and gene mapping, to provide antisense sequences, for tissue distribution studies, to locate and obtain full length genes, to identify and obtain homologous sequences (wild-type and mutants), and in diagnostic applications. The primers of this invention may comprise all or a portion of the nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID NO:4687, and the sequences set forth in Tables 8, 9, 11A, and 11B, or a complementary sequence thereof.

Probes may also be used for the detection of 12q23-qter-related sequences, and should contain at least 50%, preferably at least 80%, identity to a 12q23-qter polynucleotide, or a complementary sequence, or fragments thereof. The probes of this invention may be DNA or RNA. The probes may comprise all or a portion of the nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:92, SEQ ID NO:156 to SEQ ID NO:4687, and the sequences set forth in Tables 8, 9, 11A, and 11B, or a complementary sequence thereof, and may include promoter, enhancer elements, and introns of the naturally occurring 12q23-qter polynucleotide.

The probes and primers based on the 12q23-qter gene sequences disclosed herein are used to identify homologous 12q23-qter gene sequences and proteins in other species. These 12q23-qter gene sequences and proteins are used in the diagnostic/prognostic, therapeutic and drug-screening methods described herein for the species from which they have been isolated.

Vectors and Host Cells

The nucleic acids described herein are used in the methods of the present invention for production of proteins or polypeptides, through incorporation into cells, tissues, or organisms. In one embodiment, DNA containing all or part of the coding sequence for a chromosome 12q23-qter polypeptide, or DNA which hybridizes to DNA having the sequence SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, is incorporated into a vector for expression of the encoded polypeptide in suitable host cells. The polypeptides encoded by chromosome 12q23-qter genes, or their functional equivalents, are capable of normal activity, such as that of the product of Gene 702, a metalloprotease-like gene involved in inflammatory responses including tissue destruction and repair. A large number of vectors, including bacterial, yeast, and mammalian vectors, have been described for replication and/or expression in various host cells or cell-free systems, and may be used for gene therapy as well as for simple cloning or protein expression.

In one aspect, an expression vector comprises a nucleic acid encoding a 12q23-qter polypeptide or peptide, as described herein, operably linked to at least one regulatory sequence. Regulatory sequences are known in the art and are selected to direct expression of the desired protein in an appropriate host cell. Accordingly, the term regulatory sequence includes promoters, enhancers and other expression control elements (see D. V. Goeddel, 1990, Methods Enzymol. 185:3-7). Enhancer and other expression control sequences are described in Enhancers and Eukaryotic Gene Expression, 1983, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. It should be understood that the design of the expression vector may depend on such factors as the choice of the host cell to be transfected and/or the type of polypeptide to be expressed.

Several regulatory elements (e.g., promoters) have been isolated and shown to be effective in the transcription and translation of heterologous proteins in the various hosts. Such regulatory regions, methods of isolation, manner of manipulation, etc. are known in the art. Non-limiting examples of bacterial promoters include the β-lactamase (penicillinase) promoter; lactose promoter; tryptophan (trp) promoter; araBAD (arabinose) operon promoter; lambda-derived P_(L) promoter and N gene ribosome binding site; and the hybrid tac promoter derived from sequences of the trp and lac UV5 promoters. Non-limiting examples of yeast promoters include the 3-phosphoglycerate kinase promoter, glyceraldehyde-3-phosphate dehydrogenase (GAPDH) promoter, galactokinase (GAL1) promoter, galactoepimerase promoter, and alcohol dehydrogenase (ADH1) promoter. Suitable promoters for mammalian cells include, without limitation, viral promoters, such as those from Simian Virus 40 (SV40), Rous sarcoma virus (RSV), adenovirus (ADV), and bovine papilloma virus (BPV). Preferred replication and inheritance systems include M13, ColE1, SV40, baculovirus, lambda, adenovirus, CEN ARS, 2 μm ARS and the like. While expression vectors may replicate autonomously, they may also replicate by being inserted into the genome of the host cell, by methods well known in the art.

To obtain expression in eukaryotic cells, terminator sequences, polyadenylation sequences, and enhancer sequences that modulate gene expression may be required. Sequences that cause amplification of the gene may also be desirable. These sequences are well known in the art. Furthermore, sequences that facilitate secretion of the recombinant product from cells, including, but not limited to, bacteria, yeast, and animal cells, such as secretory signal sequences and/or preprotein or proprotein sequences, may also be included. Such sequences are well described in the art.

Expression and cloning vectors will likely contain a selectable marker, a gene encoding a protein necessary for survival or growth of a host cell transformed with the vector. The presence of this gene ensures growth of only those host cells that express the inserts. Typical selection genes encode proteins that 1) confer resistance to antibiotics or other toxic substances, e.g., ampicillin, neomycin, methotrexate, etc.; 2) complement auxotrophic deficiencies; or 3) supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli. Markers may be an inducible or non-inducible gene and will generally allow for positive selection. Non-limiting examples of markers include the ampicillin resistance marker (i.e., beta-lactamase), tetracycline resistance marker, neomycin/kanamycin resistance marker (i.e., neomycin phosphotransferase), dihydrofolate reductase, glutamine synthetase, and the like. The choice of the proper selectable marker will depend on the host cell, and appropriate markers for different hosts are understood by those of skill in the art.

Suitable expression vectors for use with the present invention include, but are not limited to, pUC, pBluescript (Stratagene), pET (Novagen, Inc., Madison, Wis.), and pREP (Invitrogen) plasmids. Vectors can contain one or more replication and inheritance systems for cloning or expression, one or more markers for selection in the host, e.g., antibiotic resistance, and one or more expression cassettes. The inserted coding sequences can be synthesized by standard methods, isolated from natural sources, or prepared as hybrids. Ligation of the coding sequences to transcriptional regulatory elements (e.g., promoters, enhancers, and/or insulators) and/or to other amino acid encoding sequences can be carried out using established methods.

Suitable cell-free expression systems for use with the present invention include, without limitation, rabbit reticulocyte lysate, wheat germ extract, canine pancreatic microsomal membranes, E. coli S30 extract, and coupled transcription/translation systems (Promega Corp., Madison, Wis.). These systems allow the expression of recombinant polypeptides or peptides upon the addition of cloning vectors, DNA fragments, or RNA sequences containing protein-coding regions and appropriate promoter elements.

Non-limiting examples of suitable host cells include bacteria, archaea, insect, fungi (e.g., yeast), plant, and animal cells (e.g., mammalian, especially human). Of particular interest are Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells, 293 cells, Neurospora, and immortalized mammalian myeloid and lymphoid cell lines. Techniques for the propagation of mammalian cells in culture are well-known (see, Jakoby and Pastan (eds), 1979, Cell Culture. Methods in Enzymology, volume 58, Academic Press, Inc., Harcourt Brace Jovanovich, N.Y.). Examples of commonly used mammalian host cell lines are VERO and HeLa cells, CHO cells, and WI38, BHK, and COS cell lines, although it will be appreciated by the skilled practitioner that other cell lines may be used, e.g., to provide higher expression, desirable glycosylation patterns, or other features.

Host cells can be transformed, transfected, or infected as appropriate by any suitable method including electroporation, calcium chloride-, lithium chloride-, lithium acetate/polyethylene glycol-, calcium phosphate-, DEAE-dextran-, liposome-mediated DNA uptake, spheroplasting, injection, microinjection, microprojectile bombardment, phage infection, viral infection, or other established methods. Alternatively, vectors containing the nucleic acids of interest can be transcribed in vitro, and the resulting RNA introduced into the host cell by well-known methods, e.g., by injection (see, Kubo et al., 1988, FEBS Letts. 241:119). The cells into which the nucleic acids described above have been introduced are meant to also include the progeny of such cells.

The nucleic acids of the invention may be isolated directly from cells. Alternatively, the polymerase chain reaction (PCR) method can be used to produce the nucleic acids of the invention, using either RNA (e.g., mRNA) or DNA (e.g., genomic DNA) as templates. Primers used for PCR can be synthesized using the sequence information provided herein and can further be designed to introduce appropriate new restriction sites, if desirable, to facilitate incorporation into a given vector for recombinant expression.
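As an illustration of designing a primer that introduces a new restriction site, the following minimal Python sketch prepends a recognition site to a gene-specific primer and checks that the insert does not already contain that site. The EcoRI recognition sequence (GAATTC) is used only as an example. The primer sequence, the function names, and the two-base 5′ clamp are hypothetical choices made for this sketch and are not taken from the sequences disclosed herein.

```python
# Illustrative sketch only: sequences and helper names are hypothetical.
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence, used here as an example


def add_site_to_primer(primer: str, site: str = ECORI_SITE, clamp: str = "GC") -> str:
    """Return a primer carrying a restriction site at its 5' end.

    The short 5' clamp is an assumption of this sketch; a few extra
    bases are often placed before the site.
    """
    return clamp.upper() + site.upper() + primer.upper()


def insert_contains_site(insert: str, site: str = ECORI_SITE) -> bool:
    """Report whether the insert already contains the recognition site.

    An internal site would be cut during cloning, so a different
    enzyme would normally be chosen in that case.
    """
    return site.upper() in insert.upper()


if __name__ == "__main__":
    gene_specific = "atggccaagctgacc"  # hypothetical gene-specific primer
    print(add_site_to_primer(gene_specific))
    print(insert_contains_site("ATGGCCAAGCTGACC"))
```

In practice the choice of enzyme would also depend on the cloning sites available in the chosen vector.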

Using the information provided in SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, one skilled in the art will be able to clone and sequence all representative nucleic acids of interest, including nucleic acids encoding complete protein-coding sequences. It is to be understood that non-protein-coding sequences contained within SEQ ID NO:156 to SEQ ID NO:693 and SEQ ID NO:694 to SEQ ID NO:979 are also within the scope of the invention. Such sequences include, without limitation, sequences important for replication, recombination, transcription, and translation. Non-limiting examples include promoters and regulatory binding sites involved in regulation of gene expression, and 5′- and 3′-untranslated sequences (e.g., ribosome-binding sites) that form part of mRNA molecules.

The nucleic acids of this invention can be produced in large quantities by replication in a suitable host cell. Natural or synthetic nucleic acid fragments, comprising at least ten contiguous bases coding for a desired peptide or polypeptide, can be incorporated into recombinant nucleic acid constructs, usually DNA constructs, capable of introduction into and replication in a prokaryotic or eukaryotic cell. Usually the nucleic acid constructs will be suitable for replication in a unicellular host, such as yeast or bacteria, but may also be intended for introduction to (with and without integration within the genome) cultured mammalian or plant or other eukaryotic cells, cell lines, tissues, or organisms. The purification of nucleic acids produced by the methods of the present invention is described, for example, in Sambrook et al., 1989; F. M. Ausubel et al., 1992, Current Protocols in Molecular Biology, J. Wiley and Sons, New York, N.Y.

The nucleic acids of the present invention can also be produced by chemical synthesis, e.g., by the phosphoramidite method described by Beaucage et al., 1981, Tetra. Letts. 22:1859-1862, or the triester method according to Matteucci et al., 1981, J. Am. Chem. Soc., 103:3185, and can be performed on commercial, automated oligonucleotide synthesizers. A double-stranded fragment may be obtained from the single-stranded product of chemical synthesis either by synthesizing the complementary strand and annealing the strands together under appropriate conditions or by adding the complementary strand using DNA polymerase with an appropriate primer sequence.

These nucleic acids can encode full-length variant forms of proteins as well as the wild-type protein. The variant proteins (which could be especially useful for detection and treatment of disorders) will have the variant amino acid sequences encoded by the polymorphisms described in Table 10, when said polymorphisms are read so as to be in-frame with the full-length coding sequence of which they are a component.

Large quantities of the nucleic acids and proteins of the present invention may be prepared by expressing the 12q23-qter nucleic acids or portions thereof in vectors or other expression vehicles in compatible prokaryotic or eukaryotic host cells. The most commonly used prokaryotic hosts are strains of Escherichia coli, although other prokaryotes, such as Bacillus subtilis or Pseudomonas, may also be used. Mammalian or other eukaryotic host cells, such as those of yeast, filamentous fungi, plant, insect, or amphibian or avian species, may also be useful for production of the proteins of the present invention. For example, insect cell systems (i.e., lepidopteran host cells and baculovirus expression vectors) are particularly suited for large-scale protein production.

Host cells carrying an expression vector (i.e., transformants or clones) are selected using markers depending on the mode of the vector construction. The marker may be on the same or a different DNA molecule, preferably the same DNA molecule. In prokaryotic hosts, the transformant may be selected, e.g., by resistance to ampicillin, tetracycline or other antibiotics. Production of a particular product based on temperature sensitivity may also serve as an appropriate marker.

Prokaryotic or eukaryotic cells comprising the nucleic acids of the present invention will be useful not only for the production of the nucleic acids and proteins of the present invention, but also, for example, in studying the characteristics of 12q23-qter proteins. Cells and animals that carry a 12q23-qter gene can be used as model systems to study and test for substances that have potential as therapeutic agents. The cells are typically cultured mesenchymal stem cells. These may be isolated from individuals with a somatic or germline 12q23-qter gene. Alternatively, the cell line can be engineered to carry a 12q23-qter gene, as described above. After a test substance is applied to the cells, the transformed phenotype of the cell is determined. Any trait of transformed cells can be assessed, including traits relating to respiratory diseases such as asthma, atopy, and response to application of putative therapeutic agents.

Antisense Nucleic Acids

A further embodiment of the invention is antisense nucleic acids or oligonucleotides which are complementary, in whole or in part, to a target molecule comprising a sense strand, and can hybridize with the target molecule. The target can be DNA, or its RNA counterpart (i.e., wherein T residues of the DNA are U residues in the RNA counterpart). When introduced into a cell, antisense nucleic acids or oligonucleotides can inhibit the expression of the gene encoded by the sense strand or the mRNA transcribed from the sense strand. Antisense nucleic acids can be produced by standard techniques. See, for example, Shewmaker, et al., U.S. Pat. No. 5,107,065.
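The complementarity relationship described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a sense strand written 5′ to 3′ using only the letters A, C, G, and T. The function names and the short example sequence are hypothetical and are not taken from the sequences disclosed herein.

```python
# Illustrative sketch only: function names and sequences are hypothetical.
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_dna(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence (5'->3')."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(sense.upper()))


def to_rna(dna: str) -> str:
    """Return the RNA counterpart of a DNA sequence (T residues become U)."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCC"                      # hypothetical sense-strand fragment
    print(antisense_dna(sense))           # antisense DNA oligonucleotide
    print(to_rna(sense))                  # mRNA counterpart of the sense strand
    print(to_rna(antisense_dna(sense)))   # antisense RNA
```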

In a particular embodiment, an antisense nucleic acid or oligonucleotide is wholly or partially complementary to and can hybridize with a target nucleic acid (either DNA or RNA), wherein the target nucleic acid can hybridize to a nucleic acid having the sequence of the complement of the strands in SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687. For example, an antisense nucleic acid or oligonucleotide can be complementary to a target nucleic acid having the sequence shown as the strand of the open reading frames SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687 or nucleic acids encoding functional equivalents of chromosome 12q23-qter genes, or to a portion of these nucleic acids sufficient to allow hybridization. A portion, for example a sequence of 16 nucleotides, could be sufficient to inhibit expression of the protein. Or, an antisense nucleic acid or oligonucleotide, complementary to 5′ or 3′ untranslated regions, or overlapping the translation initiation codons (5′ untranslated and translated regions), of chromosome 12q23-qter genes, or genes encoding a functional equivalent can also be effective. In another embodiment, the antisense nucleic acid is wholly or partially complementary to and can hybridize with a target nucleic acid that encodes a chromosome 12q23-qter polypeptide.

In addition to the antisense nucleic acids of the invention, oligonucleotides can be constructed which will bind to duplex nucleic acids either in the genes or the DNA:RNA complexes of transcription, to form stable triple helix-containing or triplex nucleic acids to inhibit transcription and/or expression of a gene encoding a chromosome 12q23-qter gene, or their functional equivalents (Frank-Kamenetskii, M. D. and Mirkin, S. M., 1995, Ann. Rev. Biochem. 64:65-95). Such oligonucleotides of the invention are constructed using the base-pairing rules of triple helix formation and the nucleotide sequences of the genes or mRNAs for chromosome 12q23-qter genes.

In preferred embodiments, at least one of the phosphodiester bonds of an antisense oligonucleotide has been substituted with a structure that functions to enhance the ability of the compositions to penetrate into the region of cells where the RNA whose activity is to be modulated is located. It is preferred that such substitutions comprise phosphorothioate bonds, methyl phosphonate bonds, or short chain alkyl or cycloalkyl structures. In accordance with other preferred embodiments, the phosphodiester bonds are substituted with structures which are, at once, substantially non-ionic and non-chiral, or with structures which are chiral and enantiomerically specific. Persons of ordinary skill in the art will be able to select other linkages for use in the practice of the invention.

Oligonucleotides may also include species that include at least some modified base forms. Thus, purines and pyrimidines other than those normally found in nature may be so employed. Similarly, modifications on the furanosyl portions of the nucleotide subunits may also be effected, as long as the essential tenets of this invention are adhered to. Examples of such modifications are 2′-O-alkyl- and 2′-halogen-substituted nucleotides. Some non-limiting examples of modifications at the 2′ position of sugar moieties which are useful in the present invention include OH, SH, SCH₃, F, OCH₃, OCN, O(CH₂)_(n)NH₂ and O(CH₂)_(n)CH₃, where n is from 1 to about 10. Such oligonucleotides are functionally interchangeable with natural oligonucleotides or synthesized oligonucleotides, which have one or more differences from the natural structure. All such analogs are comprehended by this invention so long as they function effectively to hybridize with a 12q23-qter nucleic acid to inhibit the function thereof.

The oligonucleotides in accordance with this invention preferably comprise from about 3 to about 50 subunits. It is more preferred that such oligonucleotides and analogs comprise from about 8 to about 25 subunits and still more preferred to have from about 12 to about 20 subunits. As defined herein, a “subunit” is a base and sugar combination suitably bound to adjacent subunits through phosphodiester or other bonds.

Antisense nucleic acids or oligonucleotides can be produced by standard techniques (see, e.g., Shewmaker et al., U.S. Pat. No. 5,107,065). The oligonucleotides used in accordance with this invention may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is available from several vendors, including PE Applied Biosystems (Foster City, Calif.). Any other means for such synthesis may also be employed; however, the actual synthesis of the oligonucleotides is well within the abilities of the practitioner. It is also well known to prepare other oligonucleotides, such as phosphorothioates and alkylated derivatives.

The oligonucleotides of this invention are designed to be hybridizable with 12q23-qter RNA (e.g., mRNA) or DNA. For example, an oligonucleotide (e.g., DNA oligonucleotide) that hybridizes to 12q23-qter mRNA can be used to target the mRNA for RNase H digestion. Alternatively, an oligonucleotide that hybridizes to the translation initiation site of 12q23-qter mRNA can be used to prevent translation of the mRNA. In another approach, oligonucleotides that bind to the double-stranded DNA of 12q23-qter can be administered. Such oligonucleotides can form a triplex construct and inhibit the transcription of the DNA encoding 12q23-qter polypeptides. Triple helix pairing prevents the double helix from opening sufficiently to allow the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described (see, e.g., J. E. Gee et al., 1994, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.).

As non-limiting examples, antisense oligonucleotides may be targeted to hybridize to the following regions: mRNA cap region; translation initiation site; translational termination site; transcription initiation site; transcription termination site; polyadenylation signal; 3′ untranslated region; 5′ untranslated region; 5′ coding region; mid coding region; and 3′ coding region. Preferably, the complementary oligonucleotide is designed to hybridize to the most unique 5′ sequence of a 12q23-qter gene, including any of about 15-35 nucleotides spanning the 5′ coding sequence. Appropriate oligonucleotides can be designed using OLIGO software (Molecular Biology Insights, Inc.; Cascade, Colo.).
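As a rough illustration of enumerating candidate oligonucleotides of about 15-35 nucleotides across a 5′ coding sequence, the following Python sketch lists every window in that length range together with its antisense (reverse-complement) sequence. It does not assess uniqueness, melting temperature, or secondary structure, which dedicated design software would consider. The function names and the example sequence are hypothetical.

```python
# Illustrative sketch only: function names and sequences are hypothetical.
from typing import List, Tuple

DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def candidate_antisense_oligos(
    coding_5prime: str, min_len: int = 15, max_len: int = 35
) -> List[Tuple[int, int, str]]:
    """List (start, length, antisense oligo) for each window in the length range.

    Windows are enumerated shortest first, then by start position.
    """
    seq = coding_5prime.upper()
    candidates = []
    for length in range(min_len, max_len + 1):
        for start in range(0, len(seq) - length + 1):
            target = seq[start:start + length]
            candidates.append((start, length, reverse_complement(target)))
    return candidates


if __name__ == "__main__":
    example = "ATGGCCAAGCTGACCG"  # hypothetical 16-nucleotide 5' coding fragment
    for start, length, oligo in candidate_antisense_oligos(example):
        print(start, length, oligo)
```

Each candidate would then need to be screened for specificity against other transcripts before use.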

In accordance with the present invention, an antisense oligonucleotide can be synthesized, formulated as a pharmaceutical composition, and administered to a subject. The synthesis and utilization of antisense and triplex oligonucleotides have been previously described (e.g., H. Simon et al., 1999, Antisense Nucleic Acid Drug Dev. 9:527-31; F. X. Barre et al., 2000, Proc. Natl. Acad. Sci. USA 97:3084-3088; R. Elez et al., 2000, Biochem. Biophys. Res. Commun. 269:352-6; E. R. Sauter et al., 2000, Clin. Cancer Res. 6:654-60). Alternatively, expression vectors derived from retroviruses, adenovirus, herpes or vaccinia viruses, or from various bacterial plasmids may be used for delivery of nucleotide sequences to the targeted organ, tissue or cell population. Methods which are well known to those skilled in the art can be used to construct recombinant vectors which will express a nucleic acid sequence that is complementary to the nucleic acid sequence encoding a 12q23-qter polypeptide. These techniques are described both in Sambrook et al., 1989 and in Ausubel et al., 1992. For example, 12q23-qter expression can be inhibited by transforming a cell or tissue with an expression vector that expresses high levels of untranslatable 12q23-qter sense or antisense sequences. Even in the absence of integration into the DNA, such vectors may continue to transcribe RNA molecules until they are disabled by endogenous nucleases. Transient expression may last for a month or more with a non-replicating vector, and even longer if appropriate replication elements are included in the vector system.

Various assays may be used to test the ability of antisense oligonucleotides to inhibit 12q23-qter gene expression. For example, 12q23-qter mRNA levels can be assessed by Northern blot analysis (Sambrook et al., 1989; Ausubel et al., 1992; J. C. Alwine et al., 1977, Proc. Natl. Acad. Sci. USA 74:5350-5354; I. M. Bird, 1998, Methods Mol. Biol. 105:325-36), quantitative or semi-quantitative RT-PCR analysis (see, e.g., W. M. Freeman et al., 1999, Biotechniques 26:112-122; Ren et al., 1998, Mol. Brain Res. 59:256-63; J. M. Cale et al., 1998, Methods Mol. Biol. 105:351-71), or in situ hybridization (reviewed by A. K. Raap, 1998, Mutat. Res. 400:287-298). Alternatively, 12q23-qter polypeptide levels can be measured, e.g., by western blot analysis, indirect immunofluorescence, or immunoprecipitation techniques (see, e.g., J. M. Walker, 1998, Protein Protocols on CD-ROM, Humana Press, Totowa, N.J.).

Polypeptides

The invention also relates to 12q23-qter proteins or polypeptides encoded by the nucleic acids described herein, e.g., SEQ ID NO:93 to SEQ ID NO:155, or portions or variants thereof. The proteins and polypeptides of this invention can be isolated and/or recombinant. In a preferred embodiment, the proteins or portions thereof have at least one function characteristic of a chromosome 12q23-qter protein or polypeptide. For example, Gene 702 is a metalloprotease-like gene, the product of which is involved in inflammatory responses including, but not limited to, tissue destruction and repair. These proteins are referred to as analogs, and the genes encoding them include, for example, naturally occurring chromosome 12q23-qter genes, variants (e.g., mutants) encoding those proteins and/or portions thereof. Such protein or polypeptide variants include mutants differing by the addition, deletion or substitution of one or more amino acid residues, or modified polypeptides in which one or more residues are modified (e.g., by phosphorylation, sulfation, acylation, etc.), and mutants comprising one or more modified residues. The variant can have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties, e.g., replacement of leucine with isoleucine. More infrequently, a variant can have “nonconservative” changes, e.g., replacement of a glycine with a tryptophan. Guidance in determining which amino acid residues can be substituted, inserted, or deleted without abolishing biological or immunological activity can be found using computer programs well known in the art, for example, DNASTAR software (DNASTAR, Inc., Madison, Wis.).

As non-limiting examples, conservative substitutions in a 12q23-qter amino acid sequence can be made in accordance with the following table:

Original Residue    Conservative Substitution(s)
Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu
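For illustration, the table above can be encoded as a simple lookup. The following Python sketch transcribes the table entries as given and checks whether a proposed replacement is listed as conservative for a given original residue. The dictionary and function names are hypothetical. The lookup is directional, as in the table (for example, Gln is listed for Lys, but Lys is not listed for Gln).

```python
# Illustrative sketch only: names are hypothetical; entries follow the table above.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": ["Ser"],
    "Arg": ["Lys"],
    "Asn": ["Gln", "His"],
    "Asp": ["Glu"],
    "Cys": ["Ser"],
    "Gln": ["Asn"],
    "Glu": ["Asp"],
    "Gly": ["Pro"],
    "His": ["Asn", "Gln"],
    "Ile": ["Leu", "Val"],
    "Leu": ["Ile", "Val"],
    "Lys": ["Arg", "Gln", "Glu"],
    "Met": ["Leu", "Ile"],
    "Phe": ["Met", "Leu", "Tyr"],
    "Ser": ["Thr"],
    "Thr": ["Ser"],
    "Trp": ["Tyr"],
    "Tyr": ["Trp", "Phe"],
    "Val": ["Ile", "Leu"],
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the replacement is listed for the original residue."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, [])


if __name__ == "__main__":
    print(is_conservative("Leu", "Ile"))  # listed in the table
    print(is_conservative("Gly", "Trp"))  # not listed in the table
```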

Substantial changes in function or immunogenicity can be made by selecting substitutions that are less conservative than those shown in the table, above. For example, non-conservative substitutions can be made which more significantly affect the structure of the polypeptide in the area of the alteration, for example, the alpha-helical, or beta-sheet structure; the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain. The substitutions which generally are expected to produce the greatest changes in the polypeptide's properties are those where 1) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl, or alanyl; 2) a cysteine or proline is substituted for (or by) any other residue; 3) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or 4) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) a residue that does not have a side chain, e.g., glycine.

In one embodiment, the percent amino acid sequence identity between a chromosome 12q23-qter polypeptide such as SEQ ID NO:93 to SEQ ID NO:155, and functional equivalents thereof is at least 50%. In a preferred embodiment, the percent amino acid sequence identity between such a chromosome 12q23-qter polypeptide and its functional equivalents is at least 65%. More preferably, the percent amino acid sequence identity between a chromosome 12q23-qter polypeptide and its functional equivalents is at least 75%, still more preferably, at least 80%, and even more preferably, at least 90%.

Percent sequence identity can be calculated using computer programs or direct sequence comparison. Preferred computer program methods to determine identity between two sequences include, but are not limited to, the GCG program package, FASTA, BLASTP, and TBLASTN (see, e.g., D. W. Mount, 2001, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). The BLASTP and TBLASTN programs are publicly available from NCBI and other sources. The well-known Smith-Waterman algorithm may also be used to determine identity.

Exemplary parameters for amino acid sequence comparison include the following: 1) algorithm from Needleman and Wunsch, 1970, J. Mol. Biol. 48:443-453; 2) BLOSUM62 comparison matrix from Henikoff and Henikoff, 1992, Proc. Natl. Acad. Sci. USA 89:10915-10919; 3) gap penalty=12; and 4) gap length penalty=4. A program useful with these parameters is publicly available as the “gap” program (Genetics Computer Group, Madison, Wis.). The aforementioned parameters are the default parameters for polypeptide comparisons (with no penalty for end gaps).

Alternatively, polypeptide sequence identity can be calculated using the following equation: % identity = (the number of identical residues)/(alignment length in amino acid residues)*100. For this calculation, alignment length includes internal gaps but does not include terminal gaps.
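The equation above can be sketched in Python as follows. This is a minimal illustration, assuming the two sequences have already been aligned to equal length with “-” as the gap character. Terminal gaps are interpreted here as the leading and trailing columns where either sequence has not yet started or has already ended. The function name and the example alignment are hypothetical.

```python
# Illustrative sketch only: function name and example alignment are hypothetical.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Compute % identity = identical residues / alignment length * 100.

    Alignment length counts internal gap columns but excludes terminal gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    def residue_bounds(seq: str):
        # Positions of the first and last non-gap characters in one sequence.
        positions = [i for i, ch in enumerate(seq) if ch != gap]
        if not positions:
            raise ValueError("sequence contains no residues")
        return positions[0], positions[-1]

    a_first, a_last = residue_bounds(aligned_a)
    b_first, b_last = residue_bounds(aligned_b)
    start, end = max(a_first, b_first), min(a_last, b_last)
    if start > end:
        return 0.0  # the two sequences do not overlap in the alignment

    columns = list(zip(aligned_a[start:end + 1], aligned_b[start:end + 1]))
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return identical / len(columns) * 100


if __name__ == "__main__":
    # Two terminal-gap columns on the left and one on the right are excluded;
    # the single internal gap column is counted in the alignment length.
    print(percent_identity("--MKT-LLVA", "AAMKTALIV-"))
```

In the example, seven columns remain after the terminal gaps are excluded, five of which are identical.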

In accordance with the present invention, polypeptide sequences may be identical to the sequence of any one of SEQ ID NO:93 to SEQ ID NO:155, or may include up to a certain integer number of amino acid alterations. Polypeptide alterations are selected from the group consisting of at least one amino acid deletion, substitution, including conservative and non-conservative substitution, or insertion. Alterations may occur at the amino- or carboxy-terminal positions of the reference polypeptide sequence or anywhere between those terminal positions, interspersed either individually among the amino acids in the reference sequence or in one or more contiguous groups within the reference sequence.

In specific embodiments, a polypeptide variant may be encoded by a 12q23-qter nucleic acid comprising a SNP and/or an alternate splice variant. For example, a polypeptide variant may be encoded by a 12q23-qter alternate splice variant comprising a nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; SEQ ID NO:80 to SEQ ID NO:81, or any of the alternate splice sequences set forth in Table 4. In addition, a polypeptide variant may be encoded by a nucleic acid containing one or more 12q23-qter SNPs as set forth in Table 10; FIGS. 7A-7H; FIGS. 9A-9F; FIGS. 27A-27K; and FIGS. 28A-28C. Specific examples of amino acid changes encoded by 12q23-qter SNPs are provided in Table 10, and are described in detail hereinbelow.

The invention also relates to isolated, synthesized and/or recombinant portions or fragments of a 12q23-qter protein or polypeptide as described herein. Polypeptide fragments (i.e., peptides) can be made which have full or partial function on their own, or which when mixed together (though fully, partially, or nonfunctional alone), spontaneously assemble with one or more other polypeptides to reconstitute a functional protein having at least one functional characteristic of a 12q23-qter protein of this invention. In addition, 12q23-qter polypeptide fragments may comprise, for example, one or more domains of the 12q23-qter polypeptide disclosed herein. In particular, a Gene 454 polypeptide may comprise one or more transmembrane, extracellular, or intracellular domains; a Gene 561 polypeptide may comprise an SH3 domain and/or one or more fibronectin type III repeats; and a Gene 757 polypeptide may comprise a cysteine rich domain, a Ser/Thr-XXX-Val motif, and/or one or more transmembrane repeats (see below).

Polypeptides according to the invention can comprise at least 5 contiguous amino acid residues; preferably the polypeptides comprise at least 12 contiguous residues; more preferably the polypeptides comprise at least 20 contiguous residues; and yet more preferably the polypeptides comprise at least 30 contiguous residues. Nucleic acids comprising protein-coding sequences can be used to direct the expression of asthma-associated polypeptides in intact cells or in cell-free translation systems. The coding sequence can be tailored, if desired, for more efficient expression in a given host organism, and can be used to synthesize oligonucleotides encoding the desired amino acid sequences. The resulting oligonucleotides can be inserted into an appropriate vector and expressed in a compatible host organism or translation system.
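By way of illustration only, tailoring a coding sequence to a host can be sketched as a back-translation of a peptide into an oligonucleotide sequence using one chosen codon per amino acid. The codon choices below are assumptions made for this sketch (each is a valid codon in the standard genetic code, but the table does not reflect measured codon usage for any organism), and the function name and example peptide are hypothetical.

```python
# Illustrative codon choices only: one valid standard-genetic-code codon per
# amino acid. A real table would be derived from the codon usage of the host.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(peptide, codon_table=PREFERRED_CODON, add_stop=False):
    """Return a DNA sequence encoding `peptide`, one table codon per residue."""
    codons = []
    for residue in peptide.upper():
        if residue not in codon_table:
            raise ValueError(f"no codon defined for residue {residue!r}")
        codons.append(codon_table[residue])
    if add_stop:
        codons.append("TAA")  # a standard stop codon
    return "".join(codons)


if __name__ == "__main__":
    # Hypothetical 5-residue peptide, the minimum length named in the text.
    print(back_translate("MKTAY", add_stop=True))
```

Swapping in a different `codon_table` is the only change needed to target another host in this sketch.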

The polypeptides of the present invention, including function-conservative variants, may be isolated from wild-type or mutant cells (e.g., human cells or cell lines), from heterologous organisms or cells (e.g., bacteria, yeast, insect, plant, and mammalian cells), or from cell-free translation systems (e.g., wheat germ, microsomal membrane, or bacterial extracts) in which a protein-coding sequence has been introduced and expressed. Furthermore, the polypeptides may be part of recombinant fusion proteins. The polypeptides can also, advantageously, be made by synthetic chemistry. Polypeptides may be chemically synthesized by commercially available automated procedures, including, without limitation, exclusive solid phase synthesis, partial solid phase methods, fragment condensation or classical solution synthesis.

Methods for polypeptide purification are well-known in the art, including, without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant system in which the protein contains an additional sequence (e.g., epitope or protein) tag that facilitates purification. Non-limiting examples of epitope tags include c-myc, haemagglutinin (HA), polyhistidine (6×-HIS) (SEQ ID NO: 6160), GLU-GLU, and DYKDDDDK (SEQ ID NO: 4688) (FLAG®) epitope tags. Non-limiting examples of protein tags include glutathione-S-transferase (GST), green fluorescent protein (GFP), and maltose binding protein (MBP).

In one approach, the coding sequence of a polypeptide or peptide can be cloned into a vector that creates a fusion with a sequence tag of interest. Suitable vectors include, without limitation, pRSET (Invitrogen Corp., San Diego, Calif.), pGEX (Amersham-Pharmacia Biotech, Inc., Piscataway, N.J.), pEGFP (CLONTECH Laboratories, Inc., Palo Alto, Calif.), and pMAL™ (New England BioLabs (NEB), Inc., Beverly, Mass.) plasmids. Following expression, the epitope- or protein-tagged polypeptide or peptide can be purified from a crude lysate of the translation system or host cell by chromatography on an appropriate solid-phase matrix. In some cases, it may be preferable to remove the epitope or protein tag (i.e., via protease cleavage) following purification. As an alternative approach, antibodies produced against a disorder-associated protein or against peptides derived therefrom can be used as purification reagents. Other purification methods are also possible.

The present invention also encompasses modifications of 12q23-qter polypeptides. The isolated polypeptides may be modified by, for example, phosphorylation, sulfation, acylation, or other protein modifications. They may also be modified with a label capable of providing a detectable signal, either directly or indirectly, including, but not limited to, radioisotopes and fluorescent compounds, as described in detail herein.

Both the naturally occurring and recombinant forms of the polypeptides of the invention can advantageously be used to screen compounds for binding activity. Many methods of screening for binding activity are known by those skilled in the art and may be used to practice the invention. Several methods of automated assays have been developed in recent years so as to permit screening of tens of thousands of compounds in a short period of time. Such high-throughput screening methods are particularly preferred. The use of high-throughput screening assays to test for inhibitors is greatly facilitated by the availability of large amounts of purified polypeptides, as provided by the invention. The polypeptides of the invention also find use as therapeutic agents as well as antigenic components to prepare antibodies.

The polypeptides of this invention find use as immunogenic components useful as antigens for preparing antibodies by standard methods. It is well known in the art that immunogenic epitopes generally contain at least 5 contiguous amino acid residues (Ohno et al., 1985, Proc. Natl. Acad. Sci. USA 82:2945). Therefore, the immunogenic components of this invention will typically comprise at least 5 contiguous amino acid residues of the sequence of the complete polypeptide chains. Preferably, they will contain at least 7, and most preferably at least 10 contiguous amino acid residues or more to ensure that they will be immunogenic. Whether a given component is immunogenic can readily be determined by routine experimentation. Such immunogenic components can be produced by proteolytic cleavage of larger polypeptides or by chemical synthesis or recombinant technology and are thus not limited by proteolytic cleavage sites. The present invention thus encompasses antibodies that specifically recognize asthma-associated immunogenic components.
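As a minimal sketch of how candidate immunogenic components of a given minimum length might be enumerated, the following lists every window of contiguous residues in a polypeptide sequence. The function name and the example sequence are hypothetical and are not taken from the sequence listing; whether any window is actually immunogenic would still be determined experimentally.

```python
def contiguous_peptides(sequence, length):
    """Return every window of `length` contiguous residues, in N-to-C order."""
    if length < 1:
        raise ValueError("length must be positive")
    # range() is empty when the sequence is shorter than the window.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVK"  # hypothetical 16-residue sequence
    for n in (5, 7, 10):  # minimum, preferred, and most preferred lengths
        peptides = contiguous_peptides(example, n)
        print(n, len(peptides), peptides[0])
```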

Structural Studies

A purified 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), or portions or complexes thereof, can be analyzed by well-established methods (e.g., X-ray crystallography, NMR, CD, etc.) to determine the three-dimensional structure of the molecule. The three-dimensional structure, in turn, can be used to model intermolecular interactions. Exemplary methods for crystallization and X-ray crystallography are found in P. G. Jones, 1981, Chemistry in Britain, 17:222-225; C. Jones et al. (eds), Crystallographic Methods and Protocols, Humana Press, Totowa, N.J.; A. McPherson, 1982, Preparation and Analysis of Protein Crystals, John Wiley & Sons, New York, N.Y.; T. L. Blundell and L. N. Johnson, 1976, Protein Crystallography, Academic Press, Inc., New York, N.Y.; A. Holden and P. Singer, 1960, Crystals and Crystal Growing, Anchor Books-Doubleday, New York, N.Y.; R. A. Laudise, 1970, The Growth of Single Crystals, Solid State Physical Electronics Series, N. Holonyak, Jr., (ed), Prentice-Hall, Inc.; G. H. Stout and L. H. Jensen, 1989, X-ray Structure Determination: A Practical Guide, 2nd edition, John Wiley & Sons, New York, N.Y.; Fundamentals of Analytical Chemistry, 3rd. edition, Saunders Golden Sunburst Series, Holt, Rinehart and Winston, Philadelphia, Pa., 1976; P. D. Boyle of the Department of Chemistry of North Carolina State University; and M. B. Berry, 1995, Protein Crystalization: Theory and Practice, Structure and Dynamics of E. coli Adenylate Kinase, Doctoral Thesis, Rice University, Houston, Tex.

For X-ray diffraction studies, single crystals can be grown to suitable size. Preferably, a crystal has a size of 0.2 to 0.4 mm in at least two of the three dimensions. Crystals can be formed in a solution comprising a 12q23-qter polypeptide (e.g., 1.5-200 mg/ml) and reagents that reduce the solubility to conditions close to spontaneous precipitation. Factors that affect the formation of polypeptide crystals include: 1) purity; 2) substrates or co-factors; 3) pH; 4) temperature; 5) polypeptide concentration; and 6) characteristics of the precipitant. Preferably, the 12q23-qter polypeptides are pure, i.e., free from contaminating components (at least 95% pure), and free from denatured 12q23-qter polypeptides. In particular, polypeptides can be purified by FPLC and HPLC techniques to assure homogeneity (see, Lin et al., 1992, J. Crystal. Growth. 122:242-245). Optionally, 12q23-qter polypeptide substrates or co-factors can be added to stabilize the quaternary structure of the protein and promote lattice packing.

Suitable precipitants for crystallization include, but are not limited to, salts (e.g., ammonium sulphate, potassium phosphate); polymers (e.g., polyethylene glycol (PEG) 6000); alcohols (e.g., ethanol); polyalcohols (e.g., 2-methyl-2,4-pentanediol (MPD)); organic solvents; sulfonic dyes; and deionized water. The ability of a salt to precipitate polypeptides can be generally described by the Hofmeister series: PO₄³⁻ > HPO₄²⁻ = SO₄²⁻ > citrate > CH₃CO₂⁻ > Cl⁻ > Br⁻ > NO₃⁻ > ClO₄⁻ > SCN⁻; and NH₄⁺ > K⁺ > Na⁺ > Li⁺. Non-limiting examples of salt precipitants are shown below (see Berry, 1995).

Precipitant                              Maximum concentration
(NH₄⁺/Na⁺/Li⁺)₂ or Mg²⁺ SO₄²⁻            4.0/1.5/2.1/2.5 M
NH₄⁺/Na⁺/K⁺ PO₄³⁻                        3.0/4.0/4.0 M
NH₄⁺/K⁺/Na⁺/Li⁺ citrate                  ~1.8 M
NH₄⁺/K⁺/Na⁺/Li⁺ acetate                  ~3.0 M
NH₄⁺/K⁺/Na⁺/Li⁺ Cl⁻                      5.2/9.8/4.2/5.4 M
NH₄⁺ NO₃⁻                                ~8.0 M

High molecular weight polymers useful as precipitating agents include polyethylene glycol (PEG), dextran, polyvinyl alcohol, and polyvinyl pyrrolidone (A. Polson et al., 1964, Biochem. Biophys. Acta. 82:463-475). In general, PEG is the most effective for forming crystals. PEG compounds with molecular weights less than 1000 can be used at concentrations above 40% v/v. PEGs with molecular weights above 1000 can be used at concentrations of 5-50% w/v. Typically, PEG solutions are mixed with ~0.1% sodium azide to prevent bacterial growth.
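The PEG concentration guidance above can be written as a small lookup, offered here only as a sketch. The function name and return convention are assumptions for illustration; because the text gives no figure for a molecular weight of exactly 1000, the sketch declines to guess for that case.

```python
def peg_concentration_range(molecular_weight):
    """Return (low, high, unit) for the PEG working range stated in the text.

    `high` is None where the text gives only a lower bound ("above 40% v/v").
    """
    if molecular_weight < 1000:
        return (40.0, None, "% v/v")
    if molecular_weight > 1000:
        return (5.0, 50.0, "% w/v")
    # The text covers "less than 1000" and "above 1000" only.
    raise ValueError("no range is stated for a molecular weight of exactly 1000")


if __name__ == "__main__":
    for mw in (400, 6000):
        print(mw, peg_concentration_range(mw))
```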

Typically, crystallization requires the addition of buffers and a specific salt content to maintain the proper pH and ionic strength for a protein's stability. Suitable additives include, but are not limited to, sodium chloride (e.g., 50-500 mM as additive to PEG and MPD; 0.15-2 M as additive to PEG); potassium chloride (e.g., 0.05-2 M); lithium chloride (e.g., 0.05-2 M); sodium fluoride (e.g., 20-300 mM); ammonium sulfate (e.g., 20-300 mM); lithium sulfate (e.g., 0.05-2 M); sodium or ammonium thiocyanate (e.g., 50-500 mM); MPD (e.g., 0.5-50%); 1,6 hexane diol (e.g., 0.5-10%); 1,2,3 heptane triol (e.g., 0.5-15%); and benzamidine (e.g., 0.5-15%).

Detergents may be used to maintain protein solubility and prevent aggregation. Suitable detergents include, but are not limited to, non-ionic detergents such as sugar derivatives, oligoethyleneglycol derivatives, dimethylamine-N-oxides, cholate derivatives, N-octyl hydroxyalkylsulphoxides, sulphobetains, and lipid-like detergents. Sugar-derived detergents include alkyl glucopyranosides (e.g., C8-GP, C9-GP), alkyl thio-glucopyranosides (e.g., C8-tGP), alkyl maltopyranosides (e.g., C10-M, C12-M; CYMAL-3, CYMAL-5, CYMAL-6), alkyl thio-maltopyranosides, alkyl galactopyranosides, alkyl sucroses (e.g., N-octanoylsucrose), and glucamides (e.g., HECAMEG, C-HEGA-10; MEGA-8). Oligoethyleneglycol-derived detergents include alkyl polyoxyethylenes (e.g., C8-E5, C8-En; C12-E8; C12-E9) and phenyl polyoxyethylenes (e.g., Triton X-100). Dimethylamine-N-oxide detergents include, e.g., C10-DAO; DDAO; LDAO. Cholate-derived detergents include, e.g., Deoxy-Big CHAP, digitonin. Lipid-like detergents include phosphocholine compounds. Suitable detergents further include zwitter-ionic detergents (e.g., ZWITTERGENT 3-10; ZWITTERGENT 3-12); and ionic detergents (e.g., SDS).

Crystallization of macromolecules has been performed at temperatures ranging from 60° C. to less than 0° C. However, most molecules can be crystallized at 4° C. or 22° C. Lower temperatures promote stabilization of polypeptides and inhibit bacterial growth. In general, polypeptides are more soluble in salt solutions at lower temperatures (e.g., 4° C.), but less soluble in PEG and MPD solutions at lower temperatures. To allow crystallization at 4° C. or 22° C., the precipitant or protein concentration can be increased or decreased as required. Heating, melting, and cooling of crystals or aggregates can be used to enlarge crystals. In addition, crystallization at both 4° C. and 22° C. can be assessed (A. McPherson, 1992, J. Cryst. Growth. 122:161-167; C. W. Carter, Jr. and C. W. Carter, 1979, J. Biol. Chem. 254:12219-12223; T. Bergfors, 1993, Crystalization Lab Manual).

A crystallization protocol can be adapted to a particular polypeptide or peptide. In particular, the physical and chemical properties of the polypeptide can be considered (e.g., aggregation, stability, adherence to membranes or tubing, internal disulfide linkages, surface cysteines, chelating ions, etc.). For initial experiments, the standard set of crystallization reagents can be used (Hampton Research; Laguna Niguel, Calif.). In addition, the CRYSTOOL program can provide guidance in determining optimal crystallization conditions (Brent Segelke, 1995, Efficiency analysis of sampling protocols used in protein crystallization screening and crystal structure from two novel crystal forms of PLA2, Ph.D. Thesis, University of California, San Diego). Exemplary crystallization conditions are shown below (see Berry, 1995).

Major Precipitant    Additive                                   Concentration of     Concentration
                                                                Major Precipitant    of Additive
(NH₄)₂SO₄            PEG 400-2000, MPD, ethanol, or methanol    2.0-4.0 M            6%-0.5%
Na citrate           PEG 400-2000, MPD, ethanol, or methanol    1.4-1.8 M            6%-0.5%
PEG 1000-20000       (NH₄)₂SO₄, NaCl, or Na formate             40-50%               0.2-0.6 M

Robots can be used for automatic screening and optimization of crystallization conditions. For example, the IMPAX and Oryx systems can be used (Douglas Instruments, Ltd., East Garston, United Kingdom). The CRYSTOOL program (Segelke, supra) can be integrated with the robotics programming. In addition, the Xact program can be used to construct, maintain, and record the results of various crystallization experiments (see, e.g., D. E. Brodersen et al., 1999, J. Appl. Cryst. 32:1012-1016; G. R. Andersen and J. Nyborg, 1996, J. Appl. Cryst. 29:236-240). The Xact program supports multiple users and organizes the results of crystallization experiments into hierarchies. Advantageously, Xact is compatible with both CRYSTOOL and Microsoft® Excel programs.

Four methods are commonly employed to crystallize macromolecules: vapor diffusion, free interface diffusion, batch, and dialysis. The vapor diffusion technique is typically performed by formulating a 1:1 mixture of a solution comprising the polypeptide of interest and a solution containing the precipitant at the final concentration that is to be achieved after vapor equilibration. The drop containing the 1:1 mixture of protein and precipitant is then suspended and sealed over the well solution, which contains the precipitant at the target concentration, as either a hanging or sitting drop. Vapor diffusion can be used to screen a large number of crystallization conditions or when small amounts of polypeptide are available. For screening, drop sizes of 1 to 2 μl can be used. Once preliminary crystallization conditions have been determined, drop sizes such as 10 μl can be used. Notably, results from hanging drops may be improved with agarose gels (see K. Provost and M.-C. Robert, 1991, J. Cryst. Growth. 110:258-264). Free interface diffusion is performed by layering of a low density solution onto one of higher density, usually in the form of concentrated protein onto concentrated salt. Since the solute to be crystallized must be concentrated, this method typically requires relatively large amounts of protein. However, the method can be adapted to work with small amounts of protein. In a representative experiment, 2 to 5 μl of sample is pipetted into one end of a 20 μl microcapillary pipet. Next, 2 to 5 μl of precipitant is pipetted into the capillary without introducing an air bubble, and the ends of the pipet are sealed. With sufficient amounts of protein, this method can be used to obtain relatively large crystals (see, e.g., S. M. Althoff et al., 1988, J. Mol. Biol. 199:665-666).
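The concentration changes in a vapor diffusion drop can be illustrated with a short idealized calculation. The sketch assumes that only water moves between drop and well, that the well is large enough for its precipitant concentration to stay constant, and that the protein stock itself contains no precipitant; the function name and the example values are hypothetical.

```python
def vapor_diffusion_drop(protein_stock, well_precipitant,
                         protein_ul=1.0, well_ul=1.0):
    """Idealized start and equilibrium state of a vapor diffusion drop.

    protein_stock is in mg/ml; well_precipitant is in any concentration unit.
    """
    start_volume = protein_ul + well_ul
    # Mixing dilutes both components in proportion to the volumes combined.
    start_precipitant = well_precipitant * well_ul / start_volume
    start_protein = protein_stock * protein_ul / start_volume
    # Water leaves the drop until its precipitant matches the well solution.
    end_volume = start_volume * start_precipitant / well_precipitant
    end_protein = start_protein * start_volume / end_volume
    return {
        "start_precipitant": start_precipitant,
        "start_protein": start_protein,
        "end_volume_ul": end_volume,
        "end_precipitant": well_precipitant,
        "end_protein": end_protein,
    }


if __name__ == "__main__":
    # Hypothetical 1:1 drop: 10 mg/ml protein over a 2.0 M precipitant well.
    for key, value in vapor_diffusion_drop(10.0, 2.0).items():
        print(key, value)
```

Under these assumptions a 1:1 drop starts at half the well precipitant concentration and half the stock protein concentration, and shrinks to half its volume as it equilibrates.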

The batch technique is performed by mixing concentrated polypeptide with concentrated precipitant to produce a final concentration that is supersaturated for the solute macromolecule. Notably, this method can employ relatively large amounts of solution (e.g., milliliter quantities), and can produce large crystals. For that reason, the batch technique is not recommended for screening initial crystallization conditions.

The dialysis technique is performed by diffusing precipitant molecules through a semipermeable membrane to slowly increase the concentration of the solute inside the membrane. Dialysis tubing can be used to dialyze milliliter quantities of sample, whereas dialysis buttons can be used to dialyze microliter quantities (e.g., 7-200 μl). Dialysis buttons may be constructed out of glass, perspex, or Teflon™ (see, e.g., Cambridge Repetition Engineers Ltd., Greens Road, Cambridge CB4 3EQ, UK; Hampton Research). Using this method, the precipitating solution can be varied by moving the entire dialysis button or sack into a different solution. In this way, polypeptides can be “reused” until the correct conditions for crystallization are found (see, e.g., C. W. Carter, Jr. et al., 1988, J. Cryst. Growth. 90:60-73). However, this method is not recommended for precipitants comprising concentrated PEG solutions.

Various strategies have been designed to screen crystallization conditions, including 1) pI screening; 2) grid screening; 3) factorials; 4) solubility assays; 5) perturbation; and 6) sparse matrices. In accordance with the pI screening method, the pI of a polypeptide is presumed to be its crystallization point. Screening at the pI can be performed by dialysis against low concentrations of buffer (less than 20 mM) at the appropriate pH, or by use of conventional precipitants.

The grid screening method can be performed on two-dimensional matrices. Typically, the precipitant concentration is plotted against pH. The optimal conditions can be determined for each axis, and then combined. At that point, additional factors can be tested (e.g., temperature, additives). This method works best with fast-forming crystals, and can be readily automated (see M. J. Cox and P. C. Weber, 1988, J. Cryst. Growth. 90:318-324). Grid screens are commercially available for popular precipitants such as ammonium sulphate, PEG 6000, MPD, PEG/LiCl, and NaCl (see, e.g., Hampton Research).
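A two-dimensional grid of precipitant concentration against pH can be generated programmatically, for instance as a worklist for a pipetting robot. The following is a minimal sketch; the function name, the choice of PEG 6000, and the specific concentrations and pH values are illustrative assumptions and do not describe any commercial screen.

```python
from itertools import product


def grid_screen(precipitant, concentrations, ph_values):
    """Return one condition per (concentration, pH) pair, row by row."""
    return [
        {"precipitant": precipitant, "concentration": conc, "pH": ph}
        for conc, ph in product(concentrations, ph_values)
    ]


if __name__ == "__main__":
    # Hypothetical 4 x 6 grid that fills a 24-well tray.
    conditions = grid_screen(
        "PEG 6000",
        concentrations=[5, 10, 20, 30],          # % w/v
        ph_values=[4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    )
    for well, condition in enumerate(conditions, start=1):
        print(well, condition)
```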

The incomplete factorial method can be performed by 1) selecting a set of ~20 conditions; 2) randomly assigning combinations of these conditions; 3) grading the success of the results of each experiment using an objective scale; and 4) statistically evaluating the effects of each of the conditions on crystal formation (see, e.g., C. W. Carter, Jr. et al., 1988, J. Cryst. Growth. 90:60-73). In particular, conditions such as pH, temperature, precipitating agent, and cations can be tested. Dialysis buttons are preferably used with this method. Typically, optimal conditions/combinations can be determined within 35 tests. Similar approaches, such as “footprinting” conditions, may also be employed (see, e.g., E. A. Stura et al., 1991, J. Cryst. Growth. 110:1-2).
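The four steps of the incomplete factorial method can be sketched as follows. This is a simplified illustration, not the procedure of the cited reference: combinations are drawn by plain random sampling rather than by a balanced design, the per-level mean score stands in for a full statistical evaluation, and the factor names, levels, and function names are hypothetical.

```python
import random
from collections import defaultdict


def random_design(factors, n_experiments, seed=0):
    """Step 2: randomly assign one level of every factor to each experiment."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(levels) for name, levels in factors.items()}
        for _ in range(n_experiments)
    ]


def mean_score_by_level(experiments, scores):
    """Step 4 (simplified): average the graded score for each factor level."""
    totals = defaultdict(lambda: [0.0, 0])
    for experiment, score in zip(experiments, scores):
        for name, level in experiment.items():
            entry = totals[(name, level)]
            entry[0] += score
            entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    # Step 1: hypothetical factors and levels.
    factors = {
        "pH": [5.0, 6.0, 7.0, 8.0],
        "temperature_C": [4, 22],
        "precipitant": ["PEG 6000", "ammonium sulfate", "MPD"],
        "cation": ["Na+", "K+", "Li+"],
    }
    design = random_design(factors, n_experiments=35)
    # Step 3: scores would come from grading each drop; placeholders here.
    placeholder_scores = [0.0] * len(design)
    print(mean_score_by_level(design, placeholder_scores))
```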

The perturbation approach can be performed by altering crystallization conditions by introducing a series of additives designed to test the effects of altering the structure of bulk solvent and the solvent dielectric on crystal formation (see, e.g., Whitaker et al., 1995, Biochem. 34:8221-8226). Additives for increasing the solvent dielectric include, but are not limited to, NaCl, KCl, or LiCl (e.g., 200 mM); Na formate (e.g., 200 mM); Na₂HPO₄ or K₂HPO₄ (e.g., 200 mM); urea, trichloroacetate, guanidinium HCl, or KSCN (e.g., 20-50 mM). A non-limiting list of additives for decreasing the solvent dielectric includes methanol, ethanol, isopropanol, or tert-butanol (e.g., 1-5%); MPD (e.g., 1%); PEG 400, PEG 600, or PEG 1000 (e.g., 1-4%); PEG MME (monomethylether) 550, PEG MME 750, PEG MME 2000 (e.g., 1-4%).

As an alternative to the above screening methods, the sparse matrix approach can be used (see, e.g., J. Jancarik and S.-H. Kim, 1991, J. Appl. Cryst. 24:409-411; A. McPherson, 1992, J. Cryst. Growth. 122:161-167; B. Cudney et al., 1994, Acta. Cryst. D50:414-423). Sparse matrix screens are commercially available (see, e.g., Hampton Research; Molecular Dimensions, Inc., Apopka, Fla.; Emerald Biostructures, Inc., Lemont, Ill.). Notably, data from Hampton Research sparse matrix screens can be stored and analyzed using ASPRUN software (Douglas Instruments).

Exemplary conditions for an initial screen are shown below (see Berry, 1995).

TABLE 1A
CRYSTALLIZATION CONDITIONS

Tray 1:

PEG 8000 (wells 1-6)
Well             1       2       3       4       5       6
Concentration    20%     20%     20%     35%     35%     35%
pH               5.0     7.0     8.6     5.0     7.0     8.6

Ammonium sulfate (wells 7-12)
Well             7       8       9       10      11      12
Concentration    2.0M    2.0M    2.0M    2.5M    2.5M    2.5M
pH               5.0     7.0     8.8     5.0     7.0     8.8

MPD (wells 13-16)
Well             13      14      15      16
Concentration    30%     30%     50%     50%
pH               5.8     7.6     5.8     7.6

Na Citrate (wells 17-20)
Well             17      18      19      20
Concentration    1.3M    1.3M    1.5M    1.5M
pH               5.8     7.5     5.8     7.5

Na/K Phosphate (wells 21-24)
Well             21      22      23      24
Concentration    2.0M    2.0M    2.5M    2.5M
pH               6.0     7.4     6.0     7.4

Tray 2:

PEG 2000 MME/0.2M Ammon. sulfate (wells 25-30)
Well             25      26      27      28      29      30
Concentration    25%     25%     25%     40%     40%     40%
pH               5.5     7.0     8.5     5.5     7.0     8.5

Random for wells 31 to 48

The initial screen can be used with hanging or sitting drops. To conserve the sample, tray 2 can be set up several weeks following tray 1. Wells 31-48 of tray 2 can comprise a random set of solutions. Alternatively, solutions can be formulated using sparse methods. Preferably, test solutions cover a broad range of precipitants, additives, and pH (especially pH 5.0-9.0).
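For record keeping or robot programming, the defined wells of Table 1A can be expressed as a lookup from well number to condition. The sketch below transcribes wells 1-30 from the table; the function name and data layout are assumptions for illustration, and wells 31-48 are deliberately left out because the table specifies them only as random.

```python
def build_initial_screen():
    """Map well number (1-30) to (tray, precipitant, concentration, pH)."""
    # Each block: precipitant, concentrations in table order, pH values.
    blocks = [
        ("PEG 8000", ["20%", "35%"], [5.0, 7.0, 8.6]),
        ("Ammonium sulfate", ["2.0 M", "2.5 M"], [5.0, 7.0, 8.8]),
        ("MPD", ["30%", "50%"], [5.8, 7.6]),
        ("Na citrate", ["1.3 M", "1.5 M"], [5.8, 7.5]),
        ("Na/K phosphate", ["2.0 M", "2.5 M"], [6.0, 7.4]),
        ("PEG 2000 MME / 0.2 M ammonium sulfate", ["25%", "40%"], [5.5, 7.0, 8.5]),
    ]
    wells = {}
    well = 1
    for precipitant, concentrations, ph_values in blocks:
        for concentration in concentrations:
            for ph in ph_values:
                tray = 1 if well <= 24 else 2  # wells 1-24 are on tray 1
                wells[well] = (tray, precipitant, concentration, ph)
                well += 1
    return wells


if __name__ == "__main__":
    for number, condition in build_initial_screen().items():
        print(number, condition)
```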

Seeding can be used to trigger nucleation and crystal growth (Stura and Wilson, 1990, J. Cryst. Growth. 110:270-282; C. Thaller et al., 1981, J. Mol. Biol. 147:465-469; A. McPherson and P. Schlichta, 1988, J. Cryst. Growth. 90:47-50). In general, seeding can be performed by transferring crystal seeds into a polypeptide solution to allow polypeptide molecules to deposit on the surface of the seeds and produce crystals. Two seeding methods can be used: microseeding and macroseeding. For microseeding, a crystal can be ground into tiny pieces and transferred into the protein solution. Alternatively, seeds can be transferred by adding 1-2 μl of the seed solution directly to the equilibrated protein solution. In another approach, seeds can be transferred by dipping a hair in the seed solution and then streaking the hair across the surface of the drop (streak seeding; see Stura and Wilson, supra). For macroseeding, an intact crystal can be transferred into the protein solution (see, e.g., C. Thaller et al., 1981, J. Mol. Biol. 147:465-469). Preferably, the surface of the crystal seed is washed to regenerate the growing surface prior to being transferred. Optimally, the protein solution for crystallization is close to saturation and the crystal seed is not completely dissolved upon transfer.

Antibodies

Another aspect of the invention pertains to antibodies directed to 12q23-qter polypeptides, or portions or variants thereof. The invention provides polyclonal and monoclonal antibodies that bind 12q23-qter polypeptides or peptides. The antibodies may be elicited in an animal host (e.g., rabbit, goat, mouse, or other non-human mammal) by immunization with disorder-associated immunogenic components. Antibodies may also be elicited by in vitro immunization (sensitization) of immune cells. The immunogenic components used to elicit the production of antibodies may be isolated from cells or chemically synthesized. The antibodies may also be produced in recombinant systems programmed with appropriate antibody-encoding DNA. Alternatively, the antibodies may be constructed by biochemical reconstitution of purified heavy and light chains. The antibodies include hybrid antibodies, chimeric antibodies, and univalent antibodies. Also included are Fab fragments, including Fab′ and F(ab′)₂ fragments of antibodies.

In accordance with the present invention, antibodies are directed to a 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), or variants, or portions thereof. For example, antibodies can be produced to bind to a 12q23-qter polypeptide encoded by an alternate splice variant comprising a nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:5; SEQ ID NO:17 to SEQ ID NO:18; SEQ ID NO:36 to SEQ ID NO:37; SEQ ID NO:43 to SEQ ID NO:44; SEQ ID NO:80 to SEQ ID NO:81; or any of the alternate splice sequences set forth in Table 4. As another example, antibodies can be produced to bind to a 12q23-qter polypeptide variant encoded by a nucleic acid containing one or more 12q23-qter SNPs as set forth in Table 10; FIGS. 7A-7H; FIGS. 9A-9F; FIGS. 27A-27K; and FIGS. 28A-28C. Such antibodies can be used as diagnostic and/or therapeutic reagents.

An isolated 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), or variant, or portion thereof, can be used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. A full-length 12q23-qter polypeptide can be used or, alternatively, the invention provides antigenic peptide portions of 12q23-qter for use as immunogens. The antigenic peptide of 12q23-qter comprises at least 5 contiguous amino acid residues of the amino acid sequence shown in any one of SEQ ID NO:93 to SEQ ID NO:155, or a variant thereof, and encompasses an epitope of a 12q23-qter polypeptide such that an antibody raised against the peptide forms a specific immune complex with a 12q23-qter amino acid sequence.

An appropriate immunogenic preparation can contain, for example, recombinantly produced 12q23-qter polypeptide or a chemically synthesized 12q23-qter polypeptide, or portions thereof. The preparation can further include an adjuvant, such as Freund's complete or incomplete adjuvant, or similar immunostimulatory agent. A number of adjuvants are known and used by those skilled in the art. Non-limiting examples of suitable adjuvants include incomplete Freund's adjuvant, mineral gels such as alum, aluminum phosphate, aluminum hydroxide, aluminum silica, and surface-active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol. Further examples of adjuvants include N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1′-2′-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion. A particularly useful adjuvant comprises 5% (wt/vol) squalene, 2.5% Pluronic L121 polymer and 0.2% polysorbate in phosphate buffered saline (Kwak et al., 1992, New Eng. J. Med. 327:1209-1215). Preferred adjuvants include complete BCG, Detox (RIBI, Immunochem Research Inc.), ISCOMS, and aluminum hydroxide adjuvant (Superphos, Biosector). The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against the immunogenic peptide.

Polyclonal antibodies to 12q23-qter polypeptides can be prepared as described above by immunizing a suitable subject with a 12q23-qter immunogen. The antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using immobilized 12q23-qter polypeptide or peptide. If desired, the antibody molecules can be isolated from the mammal (e.g., from the blood) and further purified by well-known techniques, such as protein A chromatography to obtain the IgG fraction.

At an appropriate time after immunization, e.g., when the antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique (see Kohler and Milstein, 1975, Nature 256:495-497; Brown et al., 1981, J. Immunol. 127:539-46; Brown et al., 1980, J. Biol. Chem. 255:4980-83; Yeh et al., 1976, PNAS 76:2927-31; and Yeh et al., 1982, Int. J. Cancer 29:269-75), the human B cell hybridoma technique (Kozbor et al., 1983, Immunol. Today 4:72), the EBV-hybridoma technique (Cole et al., 1985, Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96) or trioma techniques.

The technology for producing hybridomas is well-known (see generally R. H. Kenneth, 1980, Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum Publishing Corp., New York, N.Y.; E. A. Lerner, 1981, Yale J. Biol. Med., 54:387-402; M. L. Gefter et al., 1977, Somatic Cell Genet. 3:231-36). In general, an immortal cell line (typically a myeloma) is fused to lymphocytes (typically splenocytes) from a mammal immunized with a 12q23-qter immunogen as described above, and the culture supernatants of the resulting hybridoma cells are screened to identify a hybridoma producing a monoclonal antibody that binds 12q23-qter polypeptides or peptides.

Any of the many well known protocols used for fusing lymphocytes and immortalized cell lines can be applied for the purpose of generating a monoclonal antibody to a 12q23-qter polypeptide (see, e.g., G. Galfre et al., 1977, Nature 266:550-52; Gefter et al., 1977; Lerner, 1981; Kenneth, 1980). Moreover, the ordinarily skilled worker will appreciate that there are many variations of such methods. Typically, the immortal cell line (e.g., a myeloma cell line) is derived from the same mammalian species as the lymphocytes. For example, murine hybridomas can be made by fusing lymphocytes from a mouse immunized with an immunogenic preparation of the present invention with an immortalized mouse cell line. Preferred immortal cell lines are mouse myeloma cell lines that are sensitive to culture medium containing hypoxanthine, aminopterin, and thymidine (HAT medium). Any of a number of myeloma cell lines can be used as a fusion partner according to standard techniques, e.g., the P3-NS1/1-Ag4-1, P3-x63-Ag8.653, or Sp2/O-Ag14 myeloma lines. These myeloma lines are available from ATCC (American Type Culture Collection, Manassas, Va.). Typically, HAT-sensitive mouse myeloma cells are fused to mouse splenocytes using polyethylene glycol (PEG). Hybridoma cells resulting from the fusion are then selected using HAT medium, which kills unfused and unproductively fused myeloma cells (unfused splenocytes die after several days because they are not transformed). Hybridoma cells producing a monoclonal antibody of the invention are detected by screening the hybridoma culture supernatants for antibodies that bind 12q23-qter polypeptides or peptides, e.g., using a standard ELISA assay.

As an alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal antibody can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with the corresponding 12q23-qter polypeptide to thereby isolate immunoglobulin library members that bind the polypeptide. Kits for generating and screening phage display libraries are commercially available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612).

Additionally, examples of methods and reagents particularly amenable for use in generating and screening antibody display libraries can be found in, for example, Ladner et al. U.S. Pat. No. 5,223,409; Kang et al. PCT International Publication No. WO 92/18619; Dower et al. PCT International Publication No. WO 91/17271; Winter et al. PCT International Publication WO 92/20791; Markland et al. PCT International Publication No. WO 92/15679; Breitling et al. PCT International Publication WO 93/01288; McCafferty et al. PCT International Publication No. WO 92/01047; Garrard et al. PCT International Publication No. WO 92/09690; Ladner et al. PCT International Publication No. WO 90/02809; Fuchs et al., 1991, Bio/Technology 9:1370-1372; Hay et al., 1992, Hum. Antibod. Hybridomas 3:81-85; Huse et al., 1989, Science 246:1275-1281; Griffiths et al., 1993, EMBO J 12:725-734; Hawkins et al., 1992, J. Mol. Biol. 226:889-896; Clarkson et al., 1991, Nature 352:624-628; Gram et al., 1992, PNAS 89:3576-3580; Garrard et al., 1991, Bio/Technology 9:1373-1377; Hoogenboom et al., 1991, Nuc. Acid Res. 19:4133-4137; Barbas et al., 1991, PNAS 88:7978-7982; and McCafferty et al., 1990, Nature 348:552-55.

Additionally, recombinant antibodies to a 12q23-qter polypeptide, such as chimeric and humanized monoclonal antibodies, comprising both human and non-human portions, can be made using standard recombinant DNA techniques. Such chimeric and humanized monoclonal antibodies can be produced, for example, using methods described in Robinson et al. International Application No. PCT/US86/02269; Akira et al. European Patent Application 184,187; Taniguchi, M., European Patent Application 171,496; Morrison et al. European Patent Application 173,494; Neuberger et al. PCT International Publication No. WO 86/01533; Cabilly et al. U.S. Pat. No. 4,816,567; Cabilly et al. European Patent Application 125,023; Better et al., 1988, Science 240:1041-1043; Liu et al., 1987, PNAS 84:3439-3443; Liu et al., 1987, J. Immunol. 139:3521-3526; Sun et al., 1987, PNAS 84:214-218; Nishimura et al., 1987, Canc. Res. 47:999-1005; Wood et al., 1985, Nature 314:446-449; Shaw et al., 1988, J. Natl. Cancer Inst. 80:1553-1559; S. L. Morrison, 1985, Science 229:1202-1207; Oi et al., 1986, BioTechniques 4:214; Winter U.S. Pat. No. 5,225,539; Jones et al., 1986, Nature 321:522-525; Verhoeyen et al., 1988, Science 239:1534; and Beidler et al., 1988, J. Immunol. 141:4053-4060.

An antibody against a 12q23-qter polypeptide (e.g., a monoclonal antibody) can be used to isolate the corresponding polypeptide by standard techniques, such as affinity chromatography or immunoprecipitation. For example, antibodies can facilitate the purification of a natural 12q23-qter polypeptide from cells and of a recombinantly produced 12q23-qter polypeptide or peptide expressed in host cells. In addition, an antibody that binds to a 12q23-qter polypeptide can be used to detect the corresponding protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the protein. Such antibodies can also be used diagnostically to monitor 12q23-qter protein levels in tissue as part of a clinical testing procedure, for example, to determine the efficacy of a given treatment regimen as described in detail herein. In addition, antibodies to a 12q23-qter polypeptide can be used as therapeutics for the treatment of diseases related to abnormal 12q23-qter gene expression or function, e.g., asthma.

Ligands

The 12q23-qter polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155), polynucleotides (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), variants, or fragments or portions thereof, can be used to screen for ligands (e.g., agonists, antagonists, or inhibitors) that modulate the levels or activity of the 12q23-qter polypeptide. In addition, these 12q23-qter molecules can be used to identify endogenous ligands that bind to 12q23-qter polypeptides or polynucleotides in the cell. In one aspect of the present invention, the full-length 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155) is used to identify ligands. Alternatively, variants or portions of a 12q23-qter polypeptide are used. Such portions may comprise, for example, one or more domains of the 12q23-qter polypeptide (e.g., transmembrane, intracellular, extracellular, SH3, fibronectin III repeat, cysteine-rich, and Ser/Thr-XXX-Val domains) disclosed herein. Of particular interest are screening assays that identify agents that have relatively low levels of toxicity in human cells. A wide variety of assays may be used for this purpose, including in vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays, and the like.

Ligands that bind to the 12q23-qter polypeptides or polynucleotides of the invention are potentially useful in diagnostic applications and/or pharmaceutical compositions, as described in detail herein. Ligands may encompass numerous chemical classes, though typically they are organic molecules, e.g., small molecules. Preferably, small molecules have a molecular weight of less than 5000 daltons; more preferably, small molecules have a molecular weight of more than 50 and less than 2,500 daltons. Such molecules can comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. Useful molecules often comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Such molecules can also comprise biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs, or combinations thereof.
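As a non-limiting illustration, the molecular weight preferences stated above can be expressed as a simple pre-screening filter. The following is a minimal sketch only; the Candidate type, its field names, and the tier labels are hypothetical and are introduced here solely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical compound identifier
    mol_weight_da: float   # molecular weight in daltons


def classify_by_weight(candidate: Candidate) -> str:
    """Assign a candidate to one of the molecular weight tiers described in the text."""
    mw = candidate.mol_weight_da
    if 50 < mw < 2500:
        # more than 50 and less than 2,500 daltons
        return "more_preferred"
    if mw < 5000:
        # less than 5000 daltons, but outside the narrower window
        return "preferred"
    return "outside"


def filter_library(library: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that fall within either preferred tier."""
    return [c for c in library if classify_by_weight(c) != "outside"]
```

Such a filter addresses molecular weight only; the functional-group and structural preferences described above would require separate chemical analysis.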

Ligands may include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., 1991, Nature 354:82-84; Houghten et al., 1991, Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al., 1993, Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules.

Test agents useful for identifying 12q23-qter ligands can be obtained from a wide variety of sources including libraries of synthetic or natural compounds. Synthetic compound libraries are commercially available from, for example, Maybridge Chemical Co. (Trevillet, Cornwall, UK), Comgenex (Princeton, N.J.), Brandon Associates (Merrimack, N.H.), and Microsource (New Milford, Conn.). A rare chemical library is available from Aldrich Chemical Company, Inc. (Milwaukee, Wis.). Natural compound libraries comprising bacterial, fungal, plant or animal extracts are available from, for example, Pan Laboratories (Bothell, Wash.). In addition, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides.

Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts can be readily produced. Methods for the synthesis of molecular libraries are readily available (see, e.g., DeWitt et al., 1993, Proc. Natl. Acad. Sci. USA 90:6909; Erb et al., 1994, Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al., 1994, J. Med. Chem. 37:2678; Cho et al., 1993, Science 261:1303; Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al., 1994, Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al., 1994, J. Med. Chem. 37:1233). In addition, natural or synthetic compound libraries and compounds can be readily modified through conventional chemical, physical and biochemical means (see, e.g., Blondelle et al., 1996, Trends in Biotech. 14:60), and may be used to produce combinatorial libraries. In another approach, previously identified pharmacological agents can be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, and amidification, and the resulting analogs can be screened for 12q23-qter gene-modulating activity.

Numerous methods for producing combinatorial libraries are known in the art, including those involving biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer, or small molecule libraries of compounds (K. S. Lam, 1997, Anticancer Drug Des. 12:145).

Non-limiting examples of small molecules, small molecule libraries, combinatorial libraries, and screening methods are described in B. Seligmann, 1995, “Synthesis, Screening, Identification of Positive Compounds and Optimization of Leads from Combinatorial Libraries: Validation of Success” p. 69-70. Symposium: Exploiting Molecular Diversity: Small Molecule Libraries for Drug Discovery, La Jolla, Calif., Jan. 23-25, 1995 (conference summary available from Wendy Warr & Associates, 6 Berwick Court, Cheshire, UK CW4 7HZ); E. Martin et al., 1995, J. Med. Chem. 38:1431-1436; E. Martin et al., 1995, “Measuring diversity: Experimental design of combinatorial libraries for drug discovery” Abstract, ACS Meeting, Anaheim, Calif., COMP 32; and E. Martin, 1995, “Measuring Chemical Diversity: Random Screening or Rationale Library Design” p. 27-30, Symposium: Exploiting Molecular Diversity: Small Molecule Libraries for Drug Discovery, La Jolla, Calif., Jan. 23-25, 1995 (conference summary available from Wendy Warr & Associates, 6 Berwick Court, Cheshire, UK CW4 7HZ).

Libraries may be screened in solution (e.g., Houghten, 1992, Biotechniques 13:412-421), or on beads (Lam, 1991, Nature 354:82-84), chips (Fodor, 1993, Nature 364:555-556), bacteria or spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al., 1992, Proc. Natl. Acad. Sci. USA 89:1865-1869), or on phage (Scott and Smith, 1990, Science 249:386-390; Devlin, 1990, Science 249:404-406; Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87:6378-6382; Felici, 1991, J. Mol. Biol. 222:301-310; Ladner, supra).

Where the screening assay is a binding assay, a 12q23-qter polypeptide, polynucleotide, analog, or fragment thereof, may be joined to a label, where the label can directly or indirectly provide a detectable signal. Various labels include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules, particles, e.g., magnetic particles, and the like. Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the complementary member would normally be labeled with a molecule that provides for detection, in accordance with known procedures.

A variety of other reagents may be included in the screening assay. These include reagents like salts, neutral proteins, e.g., albumin, detergents, etc., that are used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Reagents that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The components are added in any order that produces the requisite binding. Incubations are performed at any temperature that facilitates optimal activity, typically between 4° and 40° C. Incubation periods are selected for optimum activity, but may also be optimized to facilitate rapid high-throughput screening. Normally, between 0.1 and 1 hr will be sufficient. In general, a plurality of assay mixtures is run in parallel with different agent concentrations to obtain a differential response to these concentrations. Typically, one of these concentrations serves as a negative control, i.e., at zero concentration or below the level of detection.

To perform cell-free ligand screening assays, it may be desirable to immobilize either a 12q23-qter polypeptide, polynucleotide, or fragment to a surface to facilitate identification of ligands that bind to these molecules, as well as to accommodate automation of the assay. For example, a fusion protein comprising a 12q23-qter polypeptide and an affinity tag can be produced. In one embodiment, a glutathione-S-transferase/phosphodiesterase fusion protein comprising a 12q23-qter polypeptide is adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione-derivatized microtiter plates. Cell lysates (e.g., containing ³⁵S-labeled polypeptides) are added to the coated beads under conditions to allow complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the coated beads are washed to remove any unbound polypeptides, and the amount of immobilized radiolabel is determined. Alternatively, the complex is dissociated and the radiolabel present in the supernatant is determined. In another approach, the beads are analyzed by SDS-PAGE to identify the bound polypeptides.

Ligand-binding assays can be used to identify agonists or antagonists that alter the function or levels of a 12q23-qter polypeptide. Such assays are designed to detect the interaction of test agents (e.g., small molecules) with 12q23-qter polypeptides, polynucleotides, analogs, or fragments or portions thereof. Interactions may be detected by direct measurement of binding. Alternatively, interactions may be detected by indirect indicators of binding, such as stabilization/destabilization of protein structure, or activation/inhibition of biological function. Non-limiting examples of useful ligand-binding assays are detailed below.

Ligands that bind to 12q23-qter polypeptides, polynucleotides, analogs, or fragments or portions thereof, can be identified using real-time Bimolecular Interaction Analysis (BIA; Sjolander et al., 1991, Anal. Chem. 63:2338-2345; Szabo et al., 1995, Curr. Opin. Struct. Biol. 5:699-705). BIA-based technology (e.g., BIAcore™; LKB Pharmacia, Sweden) allows study of biospecific interactions in real time, without labeling. In BIA, changes in the optical phenomenon of surface plasmon resonance (SPR) are used to determine real-time interactions of biological molecules.

Ligands can also be identified by scintillation proximity assays (SPA, described in U.S. Pat. No. 4,568,649). In a modification of this assay that is currently undergoing development, chaperonins are used to distinguish folded and unfolded proteins. A tagged protein is attached to SPA beads, and test agents are added. The bead is then subjected to mild denaturing conditions (such as, e.g., heat, exposure to SDS, etc.) and a purified labeled chaperonin is added. If a test agent binds to a target, the labeled chaperonin will not bind; conversely, if no test agent binds, the protein will undergo some degree of denaturation and the chaperonin will bind.

Ligands can also be identified using a binding assay based on mitochondrial targeting signals (Hurt et al., 1985, EMBO J. 4:2061-2068; Eilers and Schatz, 1986, Nature 322:228-231). In a mitochondrial import assay, expression vectors are constructed in which nucleic acids encoding particular target proteins are inserted downstream of sequences encoding mitochondrial import signals. The chimeric proteins are synthesized and tested for their ability to be imported into isolated mitochondria in the absence and presence of test compounds. A test compound that binds to the target protein should inhibit its uptake into isolated mitochondria in vitro.

The ligand-binding assay described in Fodor et al., 1991, Science 251:767-773, which involves testing the binding affinity of test compounds for a plurality of defined polymers synthesized on a solid substrate, can also be used.

Ligands that bind to 12q23-qter polypeptides or peptides can be identified using two-hybrid assays (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al., 1993, Cell 72:223-232; Madura et al., 1993, J. Biol. Chem. 268:12046-12054; Bartel et al., 1993, Biotechniques 14:920-924; Iwabuchi et al., 1993, Oncogene 8:1693-1696; and Brent WO 94/10300). The two-hybrid system relies on the reconstitution of transcription activation activity by association of the DNA-binding and transcription activation domains of a transcriptional activator through protein-protein interaction. The yeast GAL4 transcriptional activator may be used in this way, although other transcription factors have been used and are well known in the art. To carry out the two-hybrid assay, the GAL4 DNA-binding domain and the GAL4 transcription activation domain are expressed, separately, as fusions to potential interacting polypeptides.

In one embodiment, the “bait” protein comprises a 12q23-qter polypeptide fused to the GAL4 DNA-binding domain. The “fish” protein comprises, for example, a human cDNA library encoded polypeptide fused to the GAL4 transcription activation domain. If the two coexpressed fusion proteins interact in the nucleus of a host cell, a reporter gene (e.g., LacZ) is activated to produce a detectable phenotype. The host cells that show two-hybrid interactions can be used to isolate the plasmids containing the cDNA library sequences. These plasmids can be analyzed to determine the nucleic acid sequence and predicted polypeptide sequence of the candidate ligand. Alternatively, methods such as the three-hybrid (Licitra et al., 1996, Proc. Natl. Acad. Sci. USA 93:12817-12821), and reverse two-hybrid (Vidal et al., 1996, Proc. Natl. Acad. Sci. USA 93:10315-10320) systems may be used. Commercially available two-hybrid systems such as the CLONTECH Matchmaker™ systems and protocols (CLONTECH Laboratories, Inc., Palo Alto, Calif.) may also be used (see also, A. R. Mendelsohn et al., 1994, Curr. Op. Biotech. 5:482; E. M. Phizicky et al., 1995, Microbiological Rev. 59:94; M. Yang et al., 1995, Nucleic Acids Res. 23:1152; S. Fields et al., 1994, Trends Genet. 10:286; and U.S. Pat. Nos. 6,283,173 and 5,468,614).

Several methods of automated assays have been developed in recent years so as to permit screening of tens of thousands of test agents in a short period of time. High-throughput screening methods are particularly preferred for use with the present invention. The ligand-binding assays described herein can be adapted for high-throughput screens, or alternative screens may be employed. For example, continuous format high throughput screens (CF-HTS) using at least one porous matrix allow the researcher to test large numbers of test agents for a wide range of biological or biochemical activity (see U.S. Pat. No. 5,976,813 to Beutel et al.). Moreover, CF-HTS can be used to perform multi-step assays.

Diagnostics

As discussed herein, 12q23-qter genes are associated with various diseases and disorders, including but not limited to, asthma, atopy, obesity, male germ cell tumors, histidinemia, growth retardation with deafness and mental retardation, deficiency of Acyl-CoA dehydrogenase, spinal muscular atrophy, Darier disease, cardiomyopathy, Spinocerebellar ataxia-2, brachydactyly, Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan syndrome-1, Cardiofaciocutaneous syndrome, spinal muscular atrophy-4, tyrosinemia, phenylketonuria, B-cell non-Hodgkin lymphoma, Ulnar-mammary syndrome, Holt-Oram syndrome, Scapuloperoneal spinal muscular atrophy, alcohol intolerance, MODY, diabetes mellitus, non-insulin-dependent type 2, diabetes mellitus insulin-dependent (see National Center for Biotechnology Information; Bethesda, Md.), and inflammatory bowel disease (B. Wallaert et al., 1995, J. Exp. Med. 182:1897-1904). The present invention therefore provides nucleic acids and antibodies that can be useful in diagnosing individuals with disorders associated with aberrant 12q23-qter gene expression and/or mutated 12q23-qter genes. In particular, nucleic acids comprising 12q23-qter SNPs can be used to identify chromosomal abnormalities linked to these diseases. Additionally, antibodies directed against the amino acid variants encoded by the 12q23-qter SNPs can be used to identify disease-associated polypeptides.

Antibody-Based Diagnostic Methods:

In a further embodiment of the present invention, antibodies which specifically bind to a 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155) may be used for the diagnosis of conditions or diseases characterized by underexpression or overexpression of the 12q23-qter polynucleotide or polypeptide, or in assays to monitor patients being treated with a 12q23-qter polypeptide, polynucleotide, or antibody, or a 12q23-qter agonist, antagonist, or inhibitor.

The antibodies useful for diagnostic purposes may be prepared in the same manner as those for use in therapeutic methods, described herein. Antibodies may be raised to a full-length 12q23-qter polypeptide sequence (e.g., SEQ ID NO:93 to SEQ ID NO:155). Alternatively, the antibodies may be raised to portions or variants of the 12q23-qter polypeptide. Such variants include polypeptides encoded by the disclosed 12q23-qter SNPs or alternate splice variants. In one aspect of the invention, antibodies are prepared to bind to a 12q23-qter polypeptide fragment comprising one or more domains of the 12q23-qter polypeptide (e.g., transmembrane, intracellular, extracellular, SH3, fibronectin III repeat, cysteine-rich, and Ser/Thr-XXX-Val domains), as described in detail herein.

Diagnostic assays for a 12q23-qter polypeptide include methods that utilize the antibody and a label to detect the protein in biological samples (e.g., human body fluids, cells, tissues, or extracts of cells or tissues). The antibodies may be used with or without modification, and may be labeled by joining them, either covalently or non-covalently, with a reporter molecule. A wide variety of reporter molecules that are known in the art may be used, several of which are described herein.

The invention provides methods for detecting disease-associated antigenic components in a biological sample, which methods comprise the steps of: 1) contacting a sample suspected to contain a disease-associated antigenic component with an antibody specific for a disease-associated antigen, extracellular or intracellular, under conditions in which an antigen-antibody complex can form between the antibody and disease-associated antigenic components in the sample; and 2) detecting any antigen-antibody complex formed in step (1) using any suitable means known in the art, wherein the detection of a complex indicates the presence of disease-associated antigenic components in the sample. It will be understood that assays that utilize antibodies directed against altered 12q23-qter amino acid sequences (i.e., epitopes encoded by SNPs, modifications, mutations, or variants) are within the scope of the invention.

Many immunoassay formats are known in the art, and the particular format used is determined by the desired application. An immunoassay can use, for example, a monoclonal antibody directed against a single disease-associated epitope, a combination of monoclonal antibodies directed against different epitopes of a single disease-associated antigenic component, monoclonal antibodies directed towards epitopes of different disease-associated antigens, polyclonal antibodies directed towards the same disease-associated antigen, or polyclonal antibodies directed towards different disease-associated antigens. Protocols can also, for example, use solid supports, or may involve immunoprecipitation.

In accordance with the present invention, “competitive” (U.S. Pat. Nos. 3,654,090 and 3,850,752), “sandwich” (U.S. Pat. No. 4,016,043), and “double antibody,” or “DASP” assays may be used. Several procedures for measuring the amount of a 12q23-qter polypeptide in a sample (e.g., ELISA, RIA, and FACS) are known in the art and provide a basis for diagnosing altered or abnormal levels of 12q23-qter polypeptide expression. Normal or standard values for 12q23-qter polypeptide expression are established by incubating biological samples taken from normal subjects, preferably human, with antibody to a 12q23-qter polypeptide under conditions suitable for complex formation. The amount of standard complex formation may be quantified by various methods; photometric means are preferred. Levels of the 12q23-qter polypeptide expressed in the subject sample, negative control (normal) sample, and positive control (disease) sample are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.
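As a non-limiting illustration, the comparison of a subject value against standard values established from normal subjects can be sketched as follows. This is a minimal sketch under stated assumptions: the text above does not specify how much deviation is diagnostic, so the cutoff of two standard deviations (the n_sd parameter), the function names, and the result labels are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def standard_range(normal_values: list[float], n_sd: float = 2.0) -> tuple[float, float]:
    """Derive a standard range from signals measured in samples from normal subjects.

    The range is mean +/- n_sd sample standard deviations; n_sd = 2.0 is an
    assumed, illustrative cutoff and is not taken from the specification.
    """
    centre = mean(normal_values)
    spread = stdev(normal_values)
    return (centre - n_sd * spread, centre + n_sd * spread)


def classify_level(subject_value: float, normal_values: list[float], n_sd: float = 2.0) -> str:
    """Compare a subject's measured signal with the standard range."""
    low, high = standard_range(normal_values, n_sd)
    if subject_value < low:
        return "below_standard"
    if subject_value > high:
        return "above_standard"
    return "within_standard"
```

In practice the negative control and positive control samples described above would be run through the same comparison to confirm that the assay distinguishes them as expected.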

Typically, immunoassays use either a labeled antibody or a labeled antigenic component (i.e., to compete with the antigen in the sample for binding to the antibody). A number of fluorescent materials are known and can be utilized as labels for antibodies or polypeptides. These include, for example, Cy3, Cy5, GFP (e.g., EGFP, DsRed, dEFP, etc. (CLONTECH, Palo Alto, Calif.)), Alexa, BODIPY, fluorescein (e.g., FluorX, DTAF, and FITC), rhodamine (e.g., TRITC), auramine, Texas Red, AMCA blue, and Lucifer Yellow. Antibodies or polypeptides can also be labeled with a radioactive element or with an enzyme. Preferred isotopes include ³H, ¹⁴C, ³²P, ³⁵S, ³⁶Cl, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁹⁰Y, ¹²⁵I, ¹³¹I, and ¹⁸⁶Re.

Preferred enzymes include peroxidase, β-glucuronidase, β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus peroxidase, and alkaline phosphatase (see, e.g., U.S. Pat. Nos. 3,654,090; 3,850,752 and 4,016,043). Enzymes can be conjugated by reaction with bridging molecules such as carbodiimides, diisocyanates, glutaraldehyde, and the like. Enzyme labels can be detected visually, or measured by colorimetric, spectrophotometric, fluorospectrophotometric, amperometric, or gasometric techniques. Other labeling systems, such as avidin/biotin and Tyramide Signal Amplification (TSA™), are known in the art, and are commercially available (see, e.g., ABC kit, Vector Laboratories, Inc., Burlingame, Calif.; NEN® Life Science Products, Inc., Boston, Mass.).

Kits suitable for antibody-based diagnostic applications typically include one or more of the following components:

(1) Antibodies: The antibodies may be pre-labeled; alternatively, the antibody may be unlabeled and the ingredients for labeling may be included in the kit in separate containers, or a secondary, labeled antibody is provided; and

(2) Reaction components: The kit may also contain other suitably packaged reagents and materials needed for the particular immunoassay protocol, including solid-phase matrices, if applicable, and standards.

The kits referred to above may include instructions for conducting the test. Furthermore, in preferred embodiments, the diagnostic kits are adaptable to high-throughput and/or automated operation.

Nucleic-Acid-Based Diagnostic Methods:

The invention provides methods for detecting altered levels or sequences of 12q23-qter nucleic acids (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) in a sample, such as in a biological sample, comprising the steps of: 1) contacting a sample suspected to contain a disease-associated nucleic acid with one or more disease-associated nucleic acid probes under conditions in which hybrids can form between any of the probes and disease-associated nucleic acid in the sample; and 2) detecting any hybrids formed in step (1) using any suitable means known in the art, wherein the detection of hybrids indicates the presence of the disease-associated nucleic acid in the sample. Exemplary methods are described in Examples 9 and 10, herein below. To detect disease-associated nucleic acids present in low levels in biological samples, it may be necessary to amplify the disease-associated sequences or the hybridization signal as part of the diagnostic assay. Techniques for amplification are known to those of skill in the art.

The presence of 12q23-qter polynucleotide sequences can be detected by DNA-DNA or DNA-RNA hybridization, or by amplification using probes or primers comprising at least a portion of a 12q23-qter polynucleotide, or a sequence complementary thereto. In particular, nucleic acid amplification-based assays can use 12q23-qter oligonucleotides or oligomers to detect transformants containing 12q23-qter DNA or RNA. Preferably, 12q23-qter nucleic acids useful as probes in diagnostic methods include oligonucleotides at least 15 contiguous nucleotides in length, more preferably at least 20 contiguous nucleotides in length, and most preferably at least 25-55 contiguous nucleotides in length, that hybridize specifically with 12q23-qter nucleic acids. As non-limiting examples, probes or primers useful for diagnostics may comprise any of the 12q23-qter DNA nucleotide sequences shown in Tables 8, 9, 11A, and 11B.
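As a non-limiting illustration, candidate probes of a chosen length, and their complementary sequences, can be enumerated from a target sequence as sketched below. The function names and tier labels are hypothetical and introduced only for illustration; any sequence passed in is supplied by the user, and the sketch does not evaluate hybridization specificity, which must be assessed separately.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def length_tier(probe: str) -> str:
    """Map a probe length to the preference tiers stated in the text."""
    n = len(probe)
    if 25 <= n <= 55:
        return "most_preferred"
    if n >= 20:
        return "more_preferred"
    if n >= 15:
        return "preferred"
    return "too_short"


def candidate_probes(target: str, length: int = 25) -> list[str]:
    """List every contiguous window of the given length along the target."""
    if length < 15:
        raise ValueError("probes shorter than 15 nucleotides are not preferred")
    target = target.upper()
    return [target[i:i + length] for i in range(len(target) - length + 1)]
```

A probe intended to hybridize to the strand given as the target would be the reverse complement of one of the listed windows.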

Several methods can be used to produce specific probes for 12q23-qter polynucleotides. For example, labeled probes can be produced by oligo-labeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, 12q23-qter polynucleotide sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), or any portions or fragments thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase, such as T7, T3, or SP6, and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits (e.g., from Amersham-Pharmacia; Promega Corp.; and U.S. Biochemical Corp., Cleveland, Ohio). Suitable reporter molecules or labels which may be used include radionucleotides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

A sample to be analyzed, such as, for example, a tissue sample (e.g., hair or buccal cavity) or body fluid sample (e.g., blood or saliva), may be contacted directly with the nucleic acid probes. Alternatively, the sample may be treated to extract the nucleic acids contained therein. It will be understood that the particular method used to extract DNA will depend on the nature of the biological sample. The resulting nucleic acid from the sample may be subjected to gel electrophoresis or other size separation techniques, or the nucleic acid sample may be immobilized on an appropriate solid matrix without size separation.

Kits suitable for nucleic acid-based diagnostic applications typically include the following components:

(1) Probe DNA: The probe DNA may be prelabeled; alternatively, the probe DNA may be unlabeled and the ingredients for labeling may be included in the kit in separate containers; and

(2) Hybridization reagents: The kit may also contain other suitably packaged reagents and materials needed for the particular hybridization protocol, including solid-phase matrices, if applicable, and standards.

In cases where a disease condition is suspected to involve an alteration of a 12q23-qter nucleotide sequence, specific oligonucleotides may be constructed and used to assess the level of disease mRNA in affected cells or other tissue affected by the disease. For example, PCR can be used to test whether a person has a disease-related polymorphism (i.e., mutation). Specific methods of polymorphism identification are described herein, but are not intended to limit the present invention. The detection of polymorphisms in DNA sequences can be accomplished by a variety of methods including, but not limited to, RFLP detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy, 1978, Lancet ii:910-912), hybridization with allele-specific oligonucleotide probes (Wallace et al., 1978, Nucl. Acids Res. 6:3543-3557), including immobilized oligonucleotides (Saiki et al., 1989, Proc. Natl. Acad. Sci. USA 86:6230-6234) or oligonucleotide arrays (Maskos and Southern, 1993, Nucl. Acids Res. 21:2269-2270), allele-specific PCR (Newton et al., 1989, Nucl. Acids Res. 17:2503-2516), mismatch-repair detection (MRD) (Faham and Cox, 1995, Genome Res. 5:474-482), binding of MutS protein (Wagner et al., 1995, Nucl. Acids Res. 23:3944-3948), denaturing-gradient gel electrophoresis (DGGE) (Fischer and Lerman, 1983, Proc. Natl. Acad. Sci. USA 80:1579-1583), single-strand-conformation-polymorphism detection (Orita et al., 1989, Genomics 5:874-879), RNase cleavage at mismatched base-pairs (Myers et al., 1985, Science 230:1242), chemical (Cotton et al., 1988, Proc. Natl. Acad. Sci. USA 85:4397-4401) or enzymatic (Youil et al., 1995, Proc. Natl. Acad. Sci. USA 92:87-91) cleavage of heteroduplex DNA, methods based on allele-specific primer extension (Syvanen et al., 1990, Genomics 8:684-692), genetic bit analysis (GBA) (Nikiforov et al., 1994, Nucl. Acids Res. 22:4167-4175), the oligonucleotide-ligation assay (OLA) (Landegren et al., 1988, Science 241:1077), the allele-specific ligation chain reaction (LCR) (Barany, 1991, Proc. Natl. Acad. Sci. USA 88:189-193), gap-LCR (Abravaya et al., 1995, Nucl. Acids Res. 23:675-682), radioactive and/or fluorescent DNA sequencing using standard procedures well known in the art, and peptide nucleic acid (PNA) assays (Orum et al., 1993, Nucl. Acids Res. 21:5332-5356).
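As a non-limiting illustration of the first method listed, RFLP detection relies on a polymorphism creating or destroying a restriction site, so that the two alleles yield different fragment lengths on digestion. The following is a minimal in silico sketch under stated assumptions: the DNA is treated as a linear single-strand string, the recognition site and cut offset are supplied by the user, and the function name is hypothetical.

```python
def digest_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases from the start of the recognition site
    to the cut position on the given strand (for example, 1 for a site that
    is cut after its first base).
    """
    seq = seq.upper()
    site = site.upper()
    lengths = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        lengths.append(cut - start)   # fragment ending at this cut
        start = cut
        pos = seq.find(site, pos + 1)
    lengths.append(len(seq) - start)  # final fragment after the last cut
    return lengths
```

For example, an allele containing the site GAATTC (cut after the first G) produces two fragments, whereas an allele in which a single base change abolishes the site remains uncut; the differing fragment patterns are what a gel-based RFLP assay would resolve.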

For PCR analysis, 12q23-qter oligonucleotides may be chemically synthesized, generated enzymatically, or produced from a recombinant source. Oligomers will preferably comprise two nucleotide sequences, one with a sense orientation (5′→3′) and another with an antisense orientation (3′→5′), employed under optimized conditions for identification of a specific gene or condition. The same two oligomers, nested sets of oligomers, or even a degenerate pool of oligomers may be employed under less stringent conditions for detection and/or quantification of closely related DNA or RNA sequences.

In accordance with PCR analysis, two oligonucleotides are synthesized by standard methods or are obtained from a commercial supplier of custom-made oligonucleotides. The length and base composition are determined by standard criteria using the Oligo 4.0 primer-picking program (W. Rychlik, 1992; available from Molecular Biology Insights, Inc., Cascade, Colo.). One of the oligonucleotides is designed so that it will hybridize only to the disease gene DNA under the PCR conditions used. The other oligonucleotide is designed to hybridize to a segment of genomic DNA such that amplification of DNA using these oligonucleotide primers produces a conveniently identified DNA fragment. Samples may be obtained from hair follicles, whole blood, or the buccal cavity. The DNA fragment generated by this procedure is sequenced by standard techniques.

In one particular aspect, 12q23-qter oligonucleotides can be used to perform Genetic Bit Analysis (GBA) of 12q23-qter genes in accordance with published methods (T. T. Nikiforov et al., 1994, Nucleic Acids Res. 22(20):4167-75; T. T. Nikiforov et al., 1994, PCR Methods Appl. 3(5):285-91). In PCR-based GBA, specific fragments of genomic DNA containing the polymorphic site(s) are first amplified by PCR using one unmodified and one phosphorothioate-modified primer. The double-stranded PCR product is rendered single-stranded and then hybridized to an immobilized oligonucleotide primer in wells of a multi-well plate. The primer is designed to anneal immediately adjacent to the polymorphic site of interest. The 3′ end of the primer is extended using a mixture of individually labeled dideoxynucleoside triphosphates. The label on the extended base is then determined. Preferably, GBA is performed using semi-automated ELISA or biochip formats (see, e.g., S. R. Head et al., 1997, Nucleic Acids Res. 25(24):5065-71; T. T. Nikiforov et al., 1994, Nucleic Acids Res. 22(20):4167-75).
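The primer-extension step of GBA can be illustrated with a minimal sketch. It assumes a single-stranded template written 5′→3′ and a primer that anneals immediately 3′ of the polymorphic site on that template; the function names and the example sequences are hypothetical and are not taken from the disclosed sequences.

```python
# Minimal sketch of the single-base extension readout in GBA.
# All sequences below are made-up illustrations, not disclosed SEQ ID NOs.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gba_call(template: str, primer: str) -> str:
    """Return the dideoxynucleotide expected to extend the primer's 3' end.

    The primer (5'->3') pairs with its reverse complement on the template.
    The template base just 5' of that binding site is the polymorphic
    position, and the incorporated base is its complement.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to the template")
    if start == 0:
        raise ValueError("no template base available for extension")
    return COMPLEMENT[template[start - 1]]


if __name__ == "__main__":
    primer = "AGCTTGCA"
    allele_1 = "CCATG" + reverse_complement(primer) + "GG"
    allele_2 = "CCATA" + reverse_complement(primer) + "GG"
    print(gba_call(allele_1, primer), gba_call(allele_2, primer))
```

In an actual assay the call is read from the label on the incorporated base; the sketch only shows which base that label would correspond to for each allele.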

Other amplification techniques besides PCR may be used as alternatives, such as ligation-mediated PCR or techniques involving Q-beta replicase (Cahill et al., 1991, Clin. Chem., 37(9):1482-5). Products of amplification can be detected by agarose gel electrophoresis, quantitative hybridization, or equivalent techniques for nucleic acid detection known to one skilled in the art of molecular biology (Sambrook et al., 1989). Other alterations in the disease gene may be diagnosed by the same type of amplification-detection procedures, by using oligonucleotides designed to contain and specifically identify those alterations.

In accordance with the present invention, 12q23-qter polynucleotides may also be used to detect and quantify levels of 12q23-qter mRNA in biological samples in which altered expression of a 12q23-qter polynucleotide may be correlated with disease. These diagnostic assays may be used to distinguish between the absence, presence, increase, and decrease of 12q23-qter mRNA levels, and to monitor regulation of 12q23-qter polynucleotide levels during therapeutic treatment or intervention. For example, 12q23-qter polynucleotide sequences, or fragments, or complementary sequences thereof, can be used in Southern or Northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; or in dip stick, pin, ELISA or biochip assays utilizing fluids or tissues from patient biopsies to detect the status of, e.g., levels or overexpression of 12q23-qter genes, or to detect altered 12q23-qter gene expression. Such qualitative or quantitative methods are well known in the art (G. H. Keller and M. M. Manak, 1993, DNA Probes, 2nd Ed., Macmillan Publishers Ltd., England; C. W. Dieffenbach and G. S. Dveksler, 1995, PCR Primer: A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y.; B. D. Hames and S. J. Higgins, 1985, Gene Probes 1, 2, IRL Press at Oxford University Press, Oxford, England).

Methods suitable for quantifying the expression of 12q23-qter genes include radiolabeling or biotinylating nucleotides, co-amplification of a control nucleic acid, and standard curves onto which the experimental results are interpolated (P. C. Melby et al., 1993, J. Immunol. Methods 159:235-244; and C. Duplaa et al., 1993, Anal. Biochem. 212(1):229-36). The speed of quantifying multiple samples may be accelerated by running the assay in an ELISA format where the oligomer of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantification.
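Interpolation of an experimental result onto a standard curve can be sketched as follows. The sketch assumes piecewise-linear interpolation between measured standards, which is one simple choice and not necessarily the fitting method of the cited references; the standard amounts and absorbance values are invented for illustration.

```python
# Minimal sketch: read an unknown amount off a standard curve by
# piecewise-linear interpolation. Standards are hypothetical
# (amount, signal) pairs, e.g. ng of target versus absorbance.

def interpolate_amount(standards, signal):
    """Return the amount corresponding to `signal` on the standard curve.

    `standards` is a list of (amount, signal) pairs whose signal is assumed
    to increase with amount. Signals outside the measured range are
    rejected rather than extrapolated.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            fraction = (signal - s0) / (s1 - s0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("signal lies outside the range of the standards")


if __name__ == "__main__":
    curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
    print(interpolate_amount(curve, 0.35))
```

Rejecting out-of-range signals is a deliberate choice in this sketch: a sample that falls beyond the highest standard would normally be diluted and re-assayed rather than extrapolated.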

In accordance with these methods, the specificity of the probe, i.e., whether it is made from a highly specific region (e.g., at least 8 to 10 or 12 or 15 contiguous nucleotides in the 5′ regulatory region), or a less specific region (e.g., especially in the 3′ coding region), and the stringency of the hybridization or amplification (e.g., high, moderate, or low) will determine whether the probe identifies naturally occurring sequences encoding the 12q23-qter polypeptide, or alleles, SNPs, mutants, or related sequences.

In a particular aspect, a 12q23-qter nucleic acid sequence (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), or a sequence complementary thereto, or fragment thereof, may be useful in assays that detect 12q23-qter-related diseases such as asthma. A 12q23-qter polynucleotide can be labeled by standard methods, and added to a biological sample from a subject under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample can be washed and the signal quantified and compared with a standard value. If the amount of signal in the test sample is significantly altered from that of a comparable negative control (normal) sample, the altered levels of a 12q23-qter nucleotide sequence can be correlated with the presence of the associated disease. Such assays may also be used to evaluate the efficacy of a particular prophylactic or therapeutic regimen in animal studies, in clinical trials, or for an individual patient.

To provide a basis for the diagnosis of a disease associated with altered expression of a 12q23-qter gene, a normal or standard profile for expression is established. This may be accomplished by incubating biological samples taken from normal subjects, either animal or human, with a sequence complementary to the 12q23-qter polynucleotide, or a fragment thereof, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with those from an experiment where a known amount of a substantially purified polynucleotide is used. Standard values obtained from normal samples may be compared with values obtained from samples from patients who are symptomatic for the disease. Deviation between standard and subject (patient) values is used to establish the presence of the condition.
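One simple way to express the deviation between standard and patient values is a z-score against the normal reference group. The sketch below is illustrative only: the text does not prescribe a statistic, and the cutoff of 2.0 standard deviations, the function names, and the sample values are all assumptions.

```python
# Minimal sketch: compare a patient's expression value with a normal
# reference profile. The z-score statistic and the cutoff are assumptions
# chosen for illustration; the values are invented.
from statistics import mean, stdev


def deviation_from_normal(normal_values, patient_value):
    """Return how many sample standard deviations the patient value
    lies from the mean of the normal reference values."""
    return (patient_value - mean(normal_values)) / stdev(normal_values)


def is_altered(normal_values, patient_value, cutoff=2.0):
    """Flag a value whose absolute z-score exceeds the (assumed) cutoff."""
    return abs(deviation_from_normal(normal_values, patient_value)) > cutoff


if __name__ == "__main__":
    normals = [10.0, 12.0, 11.0, 9.0, 13.0]
    print(deviation_from_normal(normals, 19.0), is_altered(normals, 19.0))
```

The same comparison can be repeated on successive samples from one patient to follow whether expression moves back toward the normal range during treatment.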

Once the disease is diagnosed and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to evaluate whether the level of expression in the patient begins to approximate that which is observed in a normal individual. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

With respect to diseases such as asthma, the presence of an abnormal amount of a 12q23-qter transcript in a biological sample (e.g., body fluid, cells, tissues, or cell or tissue extracts) from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier, thereby preventing the development or further progression of the disease.

Microarrays:

In another embodiment of the present invention, oligonucleotides, or longer fragments derived from a 12q23-qter polynucleotide sequence described herein, may be used as targets in a microarray (e.g., biochip) system. The microarray can be used to monitor the expression level of large numbers of genes simultaneously (to produce a transcript image), and to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disease, to diagnose disease, and to develop and monitor the activities of therapeutic or prophylactic agents. Preparation and use of microarrays have been described in WO 95/11995 to Chee et al.; D. J. Lockhart et al., 1996, Nature Biotechnology 14:1675-1680; M. Schena et al., 1996, Proc. Natl. Acad. Sci. USA 93:10614-10619; U.S. Pat. No. 6,015,702 to P. Lal et al.; J. Worley et al., 2000, Microarray Biochip Technology, M. Schena, ed., Biotechniques Book, Natick, Mass., pp. 65-86; Y. H. Rogers et al., 1999, Anal. Biochem. 266(1):23-30; S. R. Head et al., 1999, Mol. Cell. Probes. 13(2):81-7; S. J. Watson et al., 2000, Biol. Psychiatry 48(12):1147-56.

In one application of the present invention, microarrays containing arrays of 12q23-qter polynucleotide sequences can be used to measure the expression levels of 12q23-qter nucleic acids in an individual. In particular, to diagnose an individual with a 12q23-qter-related condition or disease, a sample from a human or animal (containing nucleic acids, e.g., mRNA) can be used as a probe on a biochip containing an array of 12q23-qter polynucleotides (e.g., DNA) in decreasing concentrations (e.g., 1 ng, 0.1 ng, 0.01 ng, etc.). The test sample can be compared to samples from diseased and normal subjects. Biochips can also be used to identify 12q23-qter mutations or polymorphisms in a population, including but not limited to, deletions, insertions, and mismatches. For example, mutations can be identified by: 1) placing 12q23-qter polynucleotides of this invention onto a biochip; 2) taking a test sample (containing, e.g., mRNA) and adding the sample to the biochip; and 3) determining if the test samples hybridize to the 12q23-qter polynucleotides attached to the chip under various hybridization conditions (see, e.g., V. R. Chechetkin et al., 2000, J. Biomol. Struct. Dyn. 18(1):83-101). Alternatively, microarray sequencing can be performed (see, e.g., E. P. Diamandis, 2000, Clin. Chem. 46(10):1523-5).

Chromosome Mapping:

In another application of this invention, 12q23-qter nucleic acid sequences (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), or complementary sequences, or fragments thereof, can be used as probes to map genomic sequences. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to human artificial chromosome constructions (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries (see, e.g., C. M. Price, 1993, Blood Rev., 7:127-134; B. J. Trask, 1991, Trends Genet. 7:149-154).

In another of its aspects, the invention relates to a diagnostic kit for detecting a 12q23-qter polynucleotide or polypeptide as it relates to a disease or susceptibility to a disease, particularly asthma. Also related is a diagnostic kit that can be used to detect or assess asthma conditions. Such kits comprise one or more of the following:

(a) a 12q23-qter polynucleotide, preferably the nucleotide sequence of any one of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687, or a fragment thereof; or

(b) a nucleotide sequence complementary to that of (a); or

(c) a 12q23-qter polypeptide, preferably the polypeptide of any one of SEQ ID NO:93 to SEQ ID NO:155, or a fragment thereof; or

(d) an antibody to a 12q23-qter polypeptide, preferably to the polypeptide of any one of SEQ ID NO:93 to SEQ ID NO:155, or an antibody bindable fragment thereof.

It will be appreciated that in any such kits, (a), (b), (c), or (d) may comprise a substantial component and that instructions for use can be included. The kits may also contain peripheral reagents such as buffers, stabilizers, etc.

The present invention also includes a test kit for genetic screening that can be utilized to identify mutations in 12q23-qter genes. By identifying patients with mutated 12q23-qter DNA and comparing the mutation to a database that contains known mutations in 12q23-qter and a particular condition or disease, identification and/or confirmation of a particular condition or disease can be made. Accordingly, such a kit would comprise a PCR-based test that would involve reverse transcribing the patient's mRNA with a specific primer, and amplifying the resulting cDNA using another set of primers. The amplified product would be detectable by gel electrophoresis and could be compared with known standards for 12q23-qter genes. Preferably, this kit would utilize a patient's blood, serum, or saliva sample, and the DNA would be extracted using standard techniques. Primers flanking a known mutation would then be used to amplify a fragment of a 12q23-qter gene. The amplified piece would then be sequenced to determine the presence of a mutation.

Genomic Screening:

Polymorphic genetic markers linked to a 12q23-qter gene can be used to predict susceptibility to the diseases genetically linked to that chromosomal region. Similarly, the identification of polymorphic genetic markers within 12q23-qter genes will allow the identification of specific allelic variants that are in linkage disequilibrium with other genetic lesions that affect one of the disease states discussed herein, including respiratory disorders, obesity, and inflammatory bowel disease. SSCP (see below) allows the identification of polymorphisms within the genomic and coding region of the disclosed genes.

The present invention provides sequences for primers that can be used to identify exons that contain SNPs, as well as sequences for primers that can be used to identify the sequence changes of the SNPs. In particular, Table 10 shows polymorphic genetic markers within the chromosome 12q23-qter genes, which can be used to identify specific allelic variants that are in linkage disequilibrium with other genetic lesions that affect one of the disease states discussed herein, including respiratory disorders, obesity, and inflammatory bowel disease. Such markers can be used in conjunction with SSCP to identify polymorphisms within the genomic and coding region of the disclosed gene. Table 8 shows primers that can be used to identify exons containing SNPs. Table 9 shows primers that can be used to identify the sequence changes of the SNPs.

This information can be used to identify additional SNPs in accordance with the methods disclosed herein. Suitable methods for genomic screening have also been described by, e.g., Sheffield et al., 1995, Hum. Mol. Genet. 4:1837-1844; LeBlanc-Straceski et al., 1994, Genomics 19:341-9; Chen et al., 1995, Genomics 25:1-8. In employing these methods, the disclosed reagents can be used to predict the risk for disease (e.g., respiratory disorders, obesity, and inflammatory bowel disease) in a population or individual.

Therapeutics

As discussed herein, 12q23-qter genes are associated with various diseases and disorders, including but not limited to, asthma, atopy, obesity, male germ cell tumors, histidinemia, growth retardation with deafness and mental retardation, deficiency of Acyl-CoA dehydrogenase, spinal muscular atrophy, Darier disease, cardiomyopathy, Spinocerebellar ataxia-2, brachydactyly, Mevalonicaciduria, Hyperimmunoglobulinemia D, Noonan syndrome-1, Cardiofaciocutaneous syndrome, spinal muscular atrophy-4, tyrosinemia, phenylketonuria, B-cell non-Hodgkin lymphoma, Ulnar-mammary syndrome, Holt-Oram syndrome, Scapuloperoneal spinal muscular atrophy, alcohol intolerance, MODY, diabetes mellitus, non-insulin-dependent type 2, diabetes mellitus insulin-dependent (see National Center for Biotechnology Information; Bethesda, Md.), and inflammatory bowel disease (B. Wallaert et al., 1995, J. Exp. Med. 182:1897-1904). The present invention therefore provides compositions (e.g., pharmaceutical compositions) comprising 12q23-qter nucleic acids, polypeptides, antibodies, ligands, or variants, portions, or fragments thereof that can be useful in treating individuals with these disorders. Also provided are methods employing 12q23-qter nucleic acids, polypeptides, antibodies, ligands, or variants, portions, or fragments thereof to identify drug candidates that can be used to prevent, treat, or ameliorate such disorders.

Drug Screening and Design:

The present invention provides methods of screening for drugs using a 12q23-qter polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), or portion thereof, in competitive binding assays, according to methods well-known in the art. For example, competitive drug screening assays can be employed in which neutralizing antibodies capable of specifically binding a 12q23-qter polypeptide compete with a test compound for binding to the 12q23-qter polypeptide or fragments thereof.

The present invention further provides methods of rational drug design employing a 12q23-qter polypeptide, antibody, or portion or functional equivalent thereof. The goal of rational drug design is to produce structural analogs of biologically active polypeptides of interest or of small molecules with which they interact (e.g., agonists, antagonists, or inhibitors). In turn, these analogs can be used to fashion drugs which are, for example, more active or stable forms of the polypeptide, or which, e.g., enhance or interfere with the function of the polypeptide in vivo (see, e.g., Hodgson, 1991, Bio/Technology, 9:19-21). An example of rational drug design is the development of HIV protease inhibitors (Erickson et al., 1990, Science, 249:527-533).

In one approach, one first determines the three-dimensional structure of a protein of interest or, for example, of a 12q23-qter polypeptide or ligand complex, by x-ray crystallography, computer modeling, or a combination thereof. Useful information regarding the structure of a polypeptide can also be gained by computer modeling based on the structure of homologous proteins. In addition, 12q23-qter polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155), or portions thereof, can be analyzed by an alanine scan (Wells, 1991, Methods in Enzymol., 202:390-411). In this technique, each amino acid residue in a 12q23-qter polypeptide is replaced by alanine, and its effect on the activity of the polypeptide is determined.
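The set of variants used in an alanine scan can be enumerated with a short sketch. The peptide below is an arbitrary example rather than a disclosed sequence, the function name is hypothetical, and skipping residues that are already alanine is an assumption of this sketch.

```python
# Minimal sketch: enumerate single-residue alanine substitutions for a
# peptide given in one-letter code. The example peptide is arbitrary.

def alanine_scan(sequence: str):
    """Return (label, variant) pairs, one per non-alanine position.

    Labels follow the usual wild-type/position/substitution form,
    e.g. 'K3A' for lysine at position 3 replaced by alanine.
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # assumption: native alanines are left unchanged
        label = f"{residue}{index + 1}A"
        variant = sequence[:index] + "A" + sequence[index + 1:]
        variants.append((label, variant))
    return variants


if __name__ == "__main__":
    for label, variant in alanine_scan("MAKLE"):
        print(label, variant)
```

Each variant would then be expressed and assayed, and the change in activity attributed to the substituted position.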

In another approach, an antibody specific to a 12q23-qter polypeptide can be isolated, selected by a functional assay, and then analyzed to solve its crystal structure. In principle, this approach can yield a pharmacore upon which subsequent drug design can be based. Alternatively, it is possible to bypass protein crystallography altogether by generating anti-idiotypic antibodies (anti-ids) to a functional, pharmacologically active antibody. As a mirror image of a mirror image, the binding site of the anti-ids is predicted to be an analog of the corresponding 12q23-qter polypeptide. The anti-id can then be used to identify and isolate peptides from banks of chemically or biologically produced peptides. Selected peptides can subsequently be used as pharmacores.

Non-limiting examples of methods and computer tools for drug design are described in R. Cramer et al., 1974, J. Med. Chem. 17:533; H. Kubinyi (ed) 1993, 3D QSAR in Drug Design, Theory, Methods, and Applications, ESCOM, Leiden, Holland; P. Dean (ed) 1995, Molecular Similarity in Drug Design, K. Kim “Comparative molecular field analysis (CoMFA)” p. 291-324, Chapman & Hall, London, UK; Y. et al., 1993, J. Comp.-Aid. Mol. Des. 7:83-102; G. Lauri and P. A. Bartlett, 1994, J. Comp.-Aid. Mol. Des. 8:51-66; P. J. Gane and P. M. Dean, 2000, Curr. Opin. Struct. Biol. 10(4):401-4; H. O. Kim and M. Kahn, 2000, Comb. Chem. High Throughput Screen. 3(3):167-83; G. K. Farber, 1999, Pharmacol Ther. 84(3):327-32; and H. van de Waterbeemd (ed) 1996, Structure-Property Correlations in Drug Research, Academic Press, San Diego, Calif.

In another aspect of the present invention, cells and animals that carry a 12q23-qter gene or an analog thereof can be used as model systems to study and test for substances that have potential as therapeutic agents. After a test agent is administered to animals or applied to the cells, the phenotype of the animals/cells can be determined.

In accordance with these methods, one may design drugs that result in, for example, altered 12q23-qter polypeptide activity or stability. Such drugs may act as inhibitors, agonists, or antagonists of a 12q23-qter polypeptide. By virtue of the availability of cloned 12q23-qter gene sequences, sufficient amounts of the 12q23-qter polypeptide may be produced to perform such analytical studies as x-ray crystallography. In addition, the knowledge of the 12q23-qter polypeptide sequence will guide those employing computer-modeling techniques in place of, or in addition to, x-ray crystallography.

Pharmaceutical Compositions:

The present invention contemplates compositions comprising a 12q23-qter polynucleotide (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687), polypeptide (e.g., SEQ ID NO:93 to SEQ ID NO:155), antibody, ligand (e.g., agonist, antagonist, or inhibitor), or fragments, variants, or analogs thereof, and a physiologically acceptable carrier, excipient, or diluent as described in detail herein. The present invention further contemplates pharmaceutical compositions useful in practicing the therapeutic methods of this invention. Preferably, a pharmaceutical composition includes, in admixture, a pharmaceutically acceptable excipient (carrier) and one or more of a 12q23-qter polypeptide, polynucleotide, ligand, antibody, or fragment, portion, or variant thereof, as described herein, as an active ingredient. The preparation of pharmaceutical compositions that contain 12q23-qter molecules as active ingredients is well understood in the art. Typically, such compositions are prepared as injectables, either as liquid solutions or suspensions; however, solid forms suitable for solution in, or suspension in, liquid prior to injection can also be prepared. The preparation can also be emulsified. The active therapeutic ingredient is often mixed with excipients that are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like and combinations thereof. In addition, if desired, the composition can contain minor amounts of auxiliary substances, such as wetting or emulsifying agents or pH-buffering agents, which enhance the effectiveness of the active ingredient.

A 12q23-qter polypeptide, polynucleotide, ligand, antibody, or fragment, portion, or variant thereof can be formulated into the pharmaceutical composition as neutralized physiologically acceptable salt forms. Suitable salts include the acid addition salts (i.e., formed with the free amino groups of the polypeptide or antibody molecule), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed from the free carboxyl groups can also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.

The pharmaceutical compositions can be administered systemically by oral or parenteral routes. Non-limiting parenteral routes of administration include subcutaneous, intramuscular, intraperitoneal, intravenous, transdermal, inhalation, intranasal, intra-arterial, intrathecal, enteral, sublingual, or rectal. Intravenous administration, for example, can be performed by injection of a unit dose. The term “unit dose” when used in reference to a pharmaceutical composition of the present invention refers to physically discrete units suitable as unitary dosage for humans, each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in association with the required diluent, i.e., carrier or vehicle.

In one particular embodiment of the present invention, the disclosed pharmaceutical compositions are administered via mucoactive aerosol therapy (see, e.g., M. Fuloria and B. K. Rubin, 2000, Respir. Care 45:868-873; I. Gonda, 2000, J. Pharm. Sci. 89:940-945; R. Dhand, 2000, Curr. Opin. Pulm. Med. 6(1):59-70; B. K. Rubin, 2000, Respir. Care 45(6):684-94; S. Suarez and A. J. Hickey, 2000, Respir. Care. 45(6):652-66).

Pharmaceutical compositions are administered in a manner compatible with the dosage formulation, and in a therapeutically effective amount. The quantity to be administered depends on the subject to be treated, capacity of the subject's immune system to utilize the active ingredient, and degree of modulation of 12q23-qter gene activity desired. Precise amounts of active ingredient required to be administered depend on the judgment of the practitioner and are specific for each individual. However, suitable dosages may range from about 0.1 to 20, preferably about 0.5 to about 10, and more preferably one to several, milligrams of active ingredient per kilogram body weight of individual per day and depend on the route of administration. Suitable regimes for initial administration and booster shots are also variable, but are typified by an initial administration followed by repeated doses at one or more hour intervals by a subsequent injection or other administration. Alternatively, continuous intravenous infusions sufficient to maintain concentrations of 10 nM to 10 μM in the blood are contemplated. An exemplary pharmaceutical formulation comprises: 12q23-qter antagonist or inhibitor (5.0 mg/ml); sodium bisulfite USP (3.2 mg/ml); disodium edetate USP (0.1 mg/ml); and water for injection q.s.a.d. (1.0 ml). As used herein, “pg” means picogram, “ng” means nanogram, “μg” means microgram, “mg” means milligram, “μl” means microliter, “ml” means milliliter, and “l” means liter.

For further guidance in preparing pharmaceutical formulations, see, e.g., Gilman et al. (eds), 1990, Goodman and Gilman's: The Pharmacological Basis of Therapeutics, 8th ed., Pergamon Press; and Remington's Pharmaceutical Sciences, 17th ed., 1990, Mack Publishing Co., Easton, Pa.; Avis et al. (eds), 1993, Pharmaceutical Dosage Forms: Parenteral Medications, Dekker, New York; Lieberman et al. (eds), 1990, Pharmaceutical Dosage Forms: Disperse Systems, Dekker, New York.

In yet another aspect of this invention, antibodies that specifically react with a 12q23-qter polypeptide or peptides derived therefrom can be used as therapeutics. In particular, such antibodies can be used to block the activity of a 12q23-qter polypeptide. Antibodies or fragments thereof can be formulated as pharmaceutical compositions and administered to a subject. It is noted that antibody-based therapeutics produced from non-human sources can cause an undesired immune response in human subjects. To minimize this problem, chimeric antibody derivatives can be produced. Chimeric antibodies combine a non-human animal variable region with a human constant region. Chimeric antibodies can be constructed according to methods known in the art (see Morrison et al., 1984, Proc. Natl. Acad. Sci. USA 81:6851; Takeda et al., 1985, Nature 314:452; U.S. Pat. No. 4,816,567 of Cabilly et al.; U.S. Pat. No. 4,816,397 of Boss et al.; European Patent Publication EP 171496; EP 0173494; United Kingdom Patent GB 2177096B).

In addition, antibodies can be further “humanized” by any of the techniques known in the art (e.g., Teng et al., 1983, Proc. Natl. Acad. Sci. USA 80:7308-7312; Kozbor et al., 1983, Immunology Today 4:72-79; Olsson et al., 1982, Meth. Enzymol. 92:3-16; International Patent Application WO92/06193; EP 0239400). Humanized antibodies can also be obtained from commercial sources (e.g., Scotgen Limited, Middlesex, England). Immunotherapy with a humanized antibody may result in increased long-term effectiveness for the treatment of chronic disease situations or situations requiring repeated antibody treatments.

Pharmacogenetics:

The 12q23-qter polynucleotides (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) and polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155) of the invention are also useful in pharmacogenetic analysis (i.e., the study of the relationship between an individual's genotype and that individual's response to a therapeutic composition or drug). See, e.g., M. Eichelbaum, 1996, Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and M. W. Linder, 1997, Clin. Chem. 43(2):254-266. The genotype of the individual can determine the way a therapeutic acts on the body or the way the body metabolizes the therapeutic. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of therapeutic activity. Differences in the activity or metabolism of therapeutics can lead to severe toxicity or therapeutic failure. Accordingly, a physician or clinician may consider applying knowledge obtained in relevant pharmacogenetic studies in determining whether to administer a 12q23-qter polypeptide, polynucleotide, analog, antagonist, inhibitor, or modulator, as well as tailoring the dosage and/or therapeutic or prophylactic treatment regimen.

In general, two types of pharmacogenetic conditions can be differentiated. Genetic conditions can be due to a single factor that alters the way the drug acts on the body (altered drug action), or a factor that alters the way the body metabolizes the drug (altered drug metabolism). These conditions can occur either as rare genetic defects or as naturally-occurring polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited enzymopathy which results in haemolysis after ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

The discovery of genetic polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why some patients do not obtain the expected drug effects or show exaggerated drug response and serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor metabolizer (PM). The prevalence of PM is different among different populations. The gene coding for CYP2D6 is highly polymorphic and several mutations have been identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers quite frequently experience exaggerated drug response and side effects when they receive standard doses. If a metabolite is the active therapeutic moiety, PM show no therapeutic response. This has been demonstrated for the analgesic effect of codeine mediated by its CYP2D6-formed metabolite morphine. At the other extreme, ultra-rapid metabolizers fail to respond to standard doses. Recent studies have determined that ultra-rapid metabolism is attributable to CYP2D6 gene amplification.

By analogy, genetic polymorphism or mutation may lead to allelic variants of 12q23-qter genes in the population which have different levels of activity. The 12q23-qter polypeptides or polynucleotides thereby allow a clinician to ascertain a genetic predisposition that can affect treatment modality. In addition, genetic mutation or variants at other genes may potentiate or diminish the activity of 12q23-qter-targeted drugs. Thus, in a 12q23-qter gene-based treatment, a polymorphism or mutation may give rise to individuals that are more or less responsive to treatment. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides or polynucleotides can be identified.

To identify genes that modify 12q23-qter-targeted drug response, several pharmacogenetic methods can be used. One pharmacogenomics approach, “genome-wide association”, relies primarily on a high-resolution map of the human genome. This high-resolution map shows previously identified gene-related markers (e.g., a “bi-allelic” gene marker map which consists of 60,000-100,000 polymorphic or variable sites on the human genome, each of which has two variants). A high-resolution genetic map can then be compared to a map of the genome of each of a statistically significant number of patients taking part in a Phase II/III drug trial to identify markers associated with a particular observed drug response or side effect. Alternatively, a high-resolution map can be generated from a combination of some ten million known single nucleotide polymorphisms (SNPs) in the human genome. Given a genetic map based on the occurrence of such SNPs, individuals can be grouped into genetic categories depending on a particular pattern of SNPs in their individual genome. In this way, treatment regimens can be tailored to groups of genetically similar individuals, taking into account traits that may be common among such genetically similar individuals (see, e.g., D. R. Pfost et al., 2000, Trends Biotechnol. 18(8):334-8).
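The grouping of individuals by SNP pattern described above can be illustrated with a minimal Python sketch. The marker names, genotype calls, and the function name are hypothetical and chosen only for illustration; they are not taken from the disclosed sequences or from any particular genotyping platform.

```python
from collections import defaultdict


def group_by_snp_pattern(genotypes, markers):
    """Group individuals whose genotype calls match at every listed marker.

    genotypes: dict mapping an individual's ID to a dict of marker -> call.
    markers:   ordered list of marker names that define the pattern.
    """
    groups = defaultdict(list)
    for person, calls in genotypes.items():
        # The pattern is the tuple of calls at the chosen markers, in order.
        pattern = tuple(calls[m] for m in markers)
        groups[pattern].append(person)
    return dict(groups)


# Hypothetical bi-allelic calls for three trial participants.
genotypes = {
    "P1": {"snp_a": "A/A", "snp_b": "C/T"},
    "P2": {"snp_a": "A/G", "snp_b": "C/C"},
    "P3": {"snp_a": "A/A", "snp_b": "C/T"},
}
groups = group_by_snp_pattern(genotypes, ["snp_a", "snp_b"])
for pattern, members in groups.items():
    print(pattern, members)
```

In such a sketch, each resulting group could then be compared for observed drug response; the statistical test used for that comparison is outside the scope of the example.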

As another example, the “candidate gene approach” can be used. According to this method, if a gene that encodes a drug target is known, all common variants of that gene can be fairly easily identified in the population and it can be determined if having one version of the gene versus another is associated with a particular drug response.

As yet another example, a “gene expression profiling approach” can be used. This method involves testing the gene expression of an animal treated with a drug (e.g., a 12q23-qter polypeptide, polynucleotide, analog, or modulator) to determine whether gene pathways related to toxicity have been turned on.

Information obtained from one of the approaches described herein can be used to establish a pharmacogenetic profile, which can be used to determine appropriate dosage and treatment regimens for prophylactic or therapeutic treatment of an individual. A pharmacogenetic profile, when applied to dosing or drug selection, can be used to avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with a 12q23-qter polypeptide, polynucleotide, analog, antagonist, inhibitor, or modulator.

The 12q23-qter polypeptides or polynucleotides of the invention are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, polypeptide levels, or activity can be monitored over the course of treatment using the 12q23-qter compositions or modulators. For example, monitoring can be performed by: 1) obtaining a pre-administration sample from a subject prior to administration of the agent; 2) detecting the level of expression or activity of the polypeptide in the pre-administration sample; 3) obtaining one or more post-administration samples from the subject; 4) detecting the level of expression or activity of the polypeptide in the post-administration samples; 5) comparing the level of expression or activity of the polypeptide in the pre-administration sample with the level in the post-administration sample or samples; and 6) increasing or decreasing the administration of the agent to the subject accordingly.
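The comparison in step 5 above can be sketched as a simple relative-change calculation. This is an illustration only: the function name, the use of the mean of the post-administration samples, and the 10% threshold are assumptions made for the example and are not specified in this disclosure.

```python
from statistics import mean


def compare_levels(pre_level, post_levels, intended="decrease", threshold=0.10):
    """Compare pre- and post-administration expression or activity levels.

    Returns (relative_change, achieved), where relative_change is the
    fractional change of the mean post-administration level versus the
    pre-administration level, and achieved indicates whether the change
    meets the (assumed) threshold in the intended direction.
    """
    if pre_level <= 0:
        raise ValueError("pre_level must be positive")
    change = (mean(post_levels) - pre_level) / pre_level
    if intended == "increase":
        achieved = change >= threshold
    elif intended == "decrease":
        achieved = change <= -threshold
    else:
        raise ValueError("intended must be 'increase' or 'decrease'")
    return change, achieved


# Hypothetical measurements in arbitrary units.
change, achieved = compare_levels(100.0, [80.0, 60.0], intended="decrease")
print(f"relative change: {change:+.0%}, intended effect observed: {achieved}")
```

The result of such a comparison would inform step 6; how the administration is actually adjusted remains a clinical decision that the sketch does not attempt to model.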

Gene Therapy

The 12q23-qter polynucleotides (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) and polypeptides (e.g., SEQ ID NO:93 to SEQ ID NO:155) of the invention also find use as gene therapy reagents. In recent years, significant technological advances have been made in the area of gene therapy for both genetic and acquired diseases (Kay et al., 1997, Proc. Natl. Acad. Sci. USA, 94:12744-12746). Gene therapy can be defined as the transfer of DNA for therapeutic purposes. Improvement in gene transfer methods has allowed for development of gene therapy protocols for the treatment of diverse types of diseases. Gene therapy has also taken advantage of recent advances in the identification of new therapeutic genes, improvement in both viral and non-viral gene delivery systems, better understanding of gene regulation, and improvement in cell isolation and transplantation. Gene therapy would be carried out according to generally accepted methods as described by, for example, Friedman, 1991, Therapy for Genetic Diseases, Friedman, Ed., Oxford University Press, pages 105-121.

Vectors for introduction of genes both for recombination and for extrachromosomal maintenance are known in the art, and any suitable vector may be used. Methods for introducing DNA into cells such as electroporation, calcium phosphate co-precipitation, and viral transduction are known in the art, and the choice of method is within the competence of one skilled in the art (Robbins (ed), 1997, Gene Therapy Protocols, Humana Press, NJ). Cells transformed with a 12q23-qter gene can be used as model systems to study chromosome 12 disorders and to identify drug treatments for such disorders.

Gene transfer systems known in the art may be useful in the practice of the gene therapy methods of the present invention. These include viral and non-viral transfer methods. A number of viruses have been used as gene transfer vectors, including polyoma, e.g., SV40 (Madzak et al., 1992, J. Gen. Virol., 73:1533-1536), adenovirus (Berkner, 1992, Curr. Top. Microbiol. Immunol., 158:39-6; Berkner et al., 1988, BioTechniques, 6:616-629; Gorziglia et al., 1992, J. Virol., 66:4407-4412; Quantin et al., 1992, Proc. Natl. Acad. Sci. USA, 89:2581-2584; Rosenfeld et al., 1992, Cell, 68:143-155; Wilkinson et al., 1992, Nucl. Acids Res., 20:2233-2239; Stratford-Perricaudet et al., 1990, Hum. Gene Ther., 1:241-256), vaccinia virus (Mackett et al., 1992, Biotechnology, 24:495-499), adeno-associated virus (Muzyczka, 1992, Curr. Top. Microbiol. Immunol., 158:91-123; Ohi et al., 1990, Gene, 89:279-282), herpes viruses including HSV and EBV (Margolskee, 1992, Curr. Top. Microbiol. Immunol., 158:67-90; Johnson et al., 1992, J. Virol., 66:2952-2965; Fink et al., 1992, Hum. Gene Ther., 3:11-19; Breakfield et al., 1987, Mol. Neurobiol., 1:337-371; Fresse et al., 1990, Biochem. Pharmacol., 40:2189-2199), and retroviruses of avian (Brandyopadhyay et al., 1984, Mol. Cell Biol., 4:749-754; Petropouplos et al., 1992, J. Virol., 66:3391-3397), murine (Miller, 1992, Curr. Top. Microbiol. Immunol., 158:1-24; Miller et al., 1985, Mol. Cell Biol., 5:431-437; Sorge et al., 1984, Mol. Cell Biol., 4:1730-1737; Mann et al., 1985, J. Virol., 54:401-407), and human origin (Page et al., 1990, J. Virol., 64:5370-5276; Buchschalcher et al., 1992, J. Virol., 66:2731-2739). Most human gene therapy protocols have been based on disabled murine retroviruses.

Non-viral gene transfer methods known in the art include chemical techniques such as calcium phosphate coprecipitation (Graham et al., 1973, Virology, 52:456-467; Pellicer et al., 1980, Science, 209:1414-1422), mechanical techniques, for example microinjection (Anderson et al., 1980, Proc. Natl. Acad. Sci. USA, 77:5399-5403; Gordon et al., 1980, Proc. Natl. Acad. Sci. USA, 77:7380-7384; Brinster et al., 1981, Cell, 27:223-231; Constantini et al., 1981, Nature, 294:92-94), membrane fusion-mediated transfer via liposomes (Felgner et al., 1987, Proc. Natl. Acad. Sci. USA, 84:7413-7417; Wang et al., 1989, Biochemistry, 28:9508-9514; Kaneda et al., 1989, J. Biol. Chem., 264:12126-12129; Stewart et al., 1992, Hum. Gene Ther., 3:267-275; Nabel et al., 1990, Science, 249:1285-1288; Lim et al., 1992, Circulation, 83:2007-2011), and direct DNA uptake and receptor-mediated DNA transfer (Wolff et al., 1990, Science, 247:1465-1468; Wu et al., 1991, BioTechniques, 11:474-485; Zenke et al., 1990, Proc. Natl. Acad. Sci. USA, 87:3655-3659; Wu et al., 1989, J. Biol. Chem., 264:16985-16987; Wolff et al., 1991, BioTechniques, 11:474-485; Wagner et al., 1991, Proc. Natl. Acad. Sci. USA, 88:4255-4259; Cotten et al., 1990, Proc. Natl. Acad. Sci. USA, 87:4033-4037; Curiel et al., 1991, Proc. Natl. Acad. Sci. USA, 88:8850-8854; Curiel et al., 1991, Hum. Gene Ther., 3:147-154).

In one approach, plasmid DNA is complexed with a polylysine-conjugated antibody specific to the adenovirus hexon protein, and the resulting complex is bound to an adenovirus vector. The trimolecular complex is then used to infect cells. The adenovirus vector permits efficient binding, internalization, and degradation of the endosome before the coupled DNA is damaged. In another approach, liposome/DNA is used to mediate direct in vivo gene transfer. While in standard liposome preparations the gene transfer process is non-specific, localized in vivo uptake and expression have been reported in tumor deposits, for example, following direct in situ administration (Nabel, 1992, Hum. Gene Ther., 3:399-410).

Suitable gene transfer vectors possess a promoter sequence, preferably a promoter that is cell-specific and placed upstream of the sequence to be expressed. The vectors may also contain, optionally, one or more expressible marker genes for expression as an indication of successful transfection and expression of the nucleic acid sequences contained in the vector. In addition, vectors can be optimized to minimize undesired immunogenicity and maximize long-term expression of the desired gene product(s) (see Nabel, 1999, Proc. Natl. Acad. Sci. USA 96:324-326). Moreover, vectors can be chosen based on the cell type that is targeted for treatment. Notably, gene transfer therapies have been initiated for the treatment of various pulmonary diseases (see, e.g., M. J. Welsh, 1999, J. Clin. Invest. 104(9):1165-6; D. L. Ennist, 1999, Trends Pharmacol. Sci. 20:260-266; S. M. Albelda et al., 2000, Ann. Intern. Med. 132:649-660; E. Alton and C. Kitson, 2000, Expert Opin. Investig. Drugs. 9(7):1523-35).

Illustrative examples of vehicles or vector constructs for transfection or infection of the host cells include replication-defective viral vectors, DNA virus or RNA virus (retrovirus) vectors, such as adenovirus, herpes simplex virus and adeno-associated viral vectors. Adeno-associated virus vectors are single stranded and allow the efficient delivery of multiple copies of nucleic acid to the cell's nucleus. Preferred are adenovirus vectors. The vectors will normally be substantially free of any prokaryotic DNA and may comprise a number of different functional nucleic acid sequences. An example of such functional sequences may be a DNA region comprising transcriptional and translational initiation and termination regulatory sequences, including promoters (e.g., strong promoters, inducible promoters, and the like) and enhancers which are active in the host cells. Also included as part of the functional sequences is an open reading frame (polynucleotide sequence) encoding a protein of interest. Flanking sequences may also be included for site-directed integration. In some situations, the 5′-flanking sequence will allow homologous recombination, thus changing the nature of the transcriptional initiation region, so as to provide for inducible or non-inducible transcription to increase or decrease the level of transcription, as an example.

In general, the encoded and expressed 12q23-qter polypeptide may be intracellular, i.e., retained in the cytoplasm, nucleus, or in an organelle, or may be secreted by the cell. For secretion, the natural signal sequence present in a 12q23-qter polypeptide may be retained. When the polypeptide or peptide is a fragment of a 12q23-qter protein, a signal sequence may be provided so that, upon secretion and processing at the processing site, the desired protein will have the natural sequence. Specific examples of coding sequences of interest for use in accordance with the present invention include the 12q23-qter polypeptide-coding sequences disclosed herein.

As previously mentioned, a marker may be present for selection of cells containing the vector construct. The marker may be an inducible or non-inducible gene and will generally allow for positive selection under induction, or without induction, respectively. Examples of marker genes include neomycin resistance, dihydrofolate reductase, glutamine synthetase, and the like. The vector employed will generally also include an origin of replication and other genes that are necessary for replication in the host cells, as routinely employed by those having skill in the art. As an example, the replication system comprising the origin of replication and any proteins associated with replication encoded by a particular virus may be included as part of the construct. The replication system must be selected so that the genes encoding products necessary for replication do not ultimately transform the cells. Such replication systems are represented by replication-defective adenovirus (see G. Acsadi et al., 1994, Hum. Mol. Genet. 3:579-584) and by Epstein-Barr virus. An example of a replication-defective vector, particularly a replication-defective retroviral vector, is BAG (see Price et al., 1987, Proc. Natl. Acad. Sci. USA, 84:156; Sanes et al., 1986, EMBO J., 5:3133). It will be understood that the final gene construct may contain one or more genes of interest, for example, a gene encoding a bioactive metabolic molecule. In addition, cDNA, synthetically produced DNA or chromosomal DNA may be employed utilizing methods and protocols known and practiced by those having skill in the art.

According to one approach for gene therapy, a vector encoding a 12q23-qter polypeptide is directly injected into the recipient cells (in vivo gene therapy). Alternatively, cells from the intended recipients are explanted, genetically modified to encode a 12q23-qter polypeptide, and reimplanted into the donor (ex vivo gene therapy). An ex vivo approach provides the advantage of efficient viral gene transfer, which is superior to in vivo gene transfer approaches. In accordance with ex vivo gene therapy, the host cells are first transfected with engineered vectors containing at least one gene encoding a 12q23-qter polypeptide, suspended in a physiologically acceptable carrier or excipient such as saline or phosphate buffered saline, and the like, and then administered to the host. The desired gene product is expressed by the injected cells, which thus introduce the gene product into the host. The introduced gene products can thereby be utilized to treat or ameliorate a disorder (e.g., asthma, obesity, or inflammatory bowel disease) that is related to altered levels of the 12q23-qter polypeptide.

Animal Models

In accordance with the present invention, 12q23-qter polynucleotides (e.g., SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4687) can be used to generate genetically altered non-human animals or human cell lines. Any non-human animal can be used; however, typical animals are rodents, such as mice, rats, or guinea pigs. Genetically engineered animals or cell lines can carry a gene that has been altered to contain deletions, substitutions, insertions, or modifications of the polynucleotide sequence (e.g., exon sequence). Such alterations may render the gene nonfunctional (i.e., a null mutation), producing a “knockout” animal or cell line. In addition, genetically engineered animals can carry one or more exogenous or non-naturally occurring genes, i.e., “transgenes”, that are derived from different organisms (e.g., humans), or produced by synthetic or recombinant methods. Genetically altered animals or cell lines can be used to study 12q23-qter gene function, regulation, and treatments for 12q23-qter-related diseases. In particular, knockout animals and cell lines can be used to establish animal models and in vitro models for 12q23-qter-related illnesses, respectively. In addition, transgenic animals expressing human 12q23-qter can be used in drug discovery efforts.

A “transgenic animal” is any animal containing one or more cells bearing genetic information altered or received, directly or indirectly, by deliberate genetic manipulation at a subcellular level, such as by targeted recombination or microinjection or infection with recombinant virus. The term “transgenic animal” is not intended to encompass classical cross-breeding or in vitro fertilization, but rather is meant to encompass animals in which one or more cells are altered by, or receive, a recombinant DNA molecule. This recombinant DNA molecule may be specifically targeted to a defined genetic locus, may be randomly integrated within a chromosome, or it may be extrachromosomally replicating DNA.

Transgenic animals can be selected after treatment of germline cells or zygotes. For example, expression of an exogenous 12q23-qter gene or a variant can be achieved by operably linking the gene to a promoter and optionally an enhancer, and then microinjecting the construct into a zygote (see, e.g., Hogan et al., Manipulating the Mouse Embryo, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.). Such treatments include insertion of the exogenous gene and disrupted homologous genes. Alternatively, the gene(s) of the animals may be disrupted by insertion or deletion mutation or other genetic alterations using conventional techniques (see, e.g., Capecchi, 1989, Science, 244:1288; Valancius et al., 1991, Mol. Cell Biol., 11:1402; Hasty et al., 1991, Nature, 350:243; Shinkai et al., 1992, Cell, 68:855; Mombaerts et al., 1992, Cell, 68:869; Philpott et al., 1992, Science, 256:1448; Snouwaert et al., 1992, Science, 257:1083; Donehower et al., 1992, Nature, 356:215).

In one aspect of the invention, 12q23-qter gene knockout mice can be produced in accordance with well-known methods (see, e.g., M. R. Capecchi, 1989, Science, 244:1288-1292; P. Li et al., 1995, Cell 80:401-411; L. A. Galli-Taliadoros et al., 1995, J. Immunol. Methods 181(1):1-15; C. H. Westphal et al., 1997, Curr. Biol. 7(7):530-3; S. S. Cheah et al., 2000, Methods Mol. Biol. 136:455-63). The disclosed murine 12q23-qter genomic clone can be used to prepare a 12q23-qter targeting construct that can disrupt 12q23-qter in the mouse by homologous recombination at the 12q23-qter chromosomal locus. The targeting construct can comprise a disrupted or deleted 12q23-qter gene sequence that inserts in place of the functioning portion of the native mouse gene. For example, the construct can contain an insertion in the 12q23-qter protein-coding region.

Preferably, the targeting construct contains markers for both positive and negative selection. The positive selection marker allows the selective elimination of cells that lack the marker, while the negative selection marker allows the elimination of cells that carry the marker. In particular, the positive selectable marker can be an antibiotic resistance gene, such as the neomycin resistance gene, which can be placed within the coding sequence of a 12q23-qter gene to render it non-functional, while at the same time rendering the construct selectable. The herpes simplex virus thymidine kinase (HSV tk) gene is an example of a negative selectable marker that can be used as a second marker to eliminate cells that carry it. Cells with the HSV tk gene are selectively killed in the presence of ganciclovir. As an example, a positive selection marker can be positioned on a targeting construct within the region of the construct that integrates at the locus of the 12q23-qter gene. The negative selection marker can be positioned on the targeting construct outside the region that integrates at the locus of the 12q23-qter gene. Thus, if the entire construct is present in the cell, both positive and negative selection markers will be present. If the construct has integrated into the genome by homologous recombination at the 12q23-qter locus, the positive selection marker will be present, but the negative selection marker will be lost.
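The logic of the two-marker selection described above can be summarized in a short Python sketch. The outcome labels and the function name are illustrative assumptions; the sketch models only the presence or absence of each marker, not the biology of drug selection.

```python
def survives_selection(has_positive_marker, has_negative_marker):
    """A cell survives only if it carries the positive marker (e.g., neomycin
    resistance) and lacks the negative marker (e.g., HSV tk)."""
    return has_positive_marker and not has_negative_marker


# Marker status (positive, negative) for three illustrative outcomes.
outcomes = {
    "construct absent": (False, False),
    "entire construct present": (True, True),
    "homologous integration at the locus": (True, False),
}

survivors = [
    name for name, (pos, neg) in outcomes.items() if survives_selection(pos, neg)
]
print(survivors)
```

Only the homologous integration outcome passes both selections, which is why the negative marker is placed outside the region that integrates at the locus.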

The targeting construct can be employed, for example, in embryonal stem (ES) cells. ES cells may be obtained from pre-implantation embryos cultured in vitro (M. J. Evans et al., 1981, Nature 292:154-156; M. O. Bradley et al., 1984, Nature 309:255-258; Gossler et al., 1986, Proc. Natl. Acad. Sci. USA 83:9065-9069; Robertson et al., 1986, Nature 322:445-448; S. A. Wood et al., 1993, Proc. Natl. Acad. Sci. USA 90:4582-4584). Targeting constructs can be efficiently introduced into the ES cells by standard techniques such as DNA transfection or by retrovirus-mediated transduction. Following this, the transformed ES cells can be combined with blastocysts from a non-human animal. The introduced ES cells colonize the embryo and contribute to the germ line of the resulting chimeric animal (R. Jaenisch, 1988, Science 240:1468-1474). The use of gene-targeted ES cells in the generation of gene-targeted transgenic mice has been previously described (Thomas et al., 1987, Cell 51:503-512) and is reviewed elsewhere (Frohman et al., 1989, Cell 56:145-147; Capecchi, 1989, Trends in Genet. 5:70-76; Baribault et al., 1989, Mol. Biol. Med. 6:481-492; Wagner, 1990, EMBO J. 9:3025-3032; Bradley et al., 1992, Bio/Technology 10:534-539).

Several methods can be used to select homologously recombined murine ES cells. One method employs PCR to screen pools of transformant cells for homologous insertion, followed by screening individual clones (Kim et al., 1988, Nucleic Acids Res. 16:8887-8903; Kim et al., 1991, Gene 103:227-233). Another method employs a marker gene constructed so that it will only be active if homologous insertion occurs, allowing these recombinants to be selected directly (Sedivy et al., 1989, Proc. Natl. Acad. Sci. USA 86:227-231). For example, the positive-negative selection (PNS) method can be used as described above (see, e.g., Mansour et al., 1988, Nature 336:348-352; Capecchi, 1989, Science 244:1288-1292; Capecchi, 1989, Trends in Genet. 5:70-76). In particular, the PNS method is useful for targeting genes that are expressed at low levels.

The absence of a functional 12q23-qter gene in the knockout mice can be confirmed, for example, by RNA analysis, protein expression analysis, and functional studies. For RNA analysis, RNA samples are prepared from different organs of the knockout mice and the 12q23-qter transcript is detected in Northern blots using oligonucleotide probes specific for the transcript. For protein expression detection, antibodies that are specific for the 12q23-qter polypeptide are used, for example, in flow cytometric analysis, immunohistochemical staining, and activity assays. Alternatively, functional assays are performed using preparations of different cell types collected from the knockout mice.

Several approaches can be used to produce transgenic mice. In one approach, a targeting vector is integrated into ES cells by homologous recombination, an intrachromosomal recombination event is used to eliminate the selectable markers, and only the transgene is left behind (A. L. Joyner et al., 1989, Nature 338(6211):153-6; P. Hasty et al., 1991, Nature 350(6315):243-6; V. Valancius and O. Smithies, 1991, Mol. Cell Biol. 11(3):1402-8; S. Fiering et al., 1993, Proc. Natl. Acad. Sci. USA 90(18):8469-73). In an alternative approach, two or more strains are created; one strain contains the gene knocked-out by homologous recombination, while one or more strains contain transgenes. The knockout strain is crossed with the transgenic strain to produce a new line of animals in which the original wild-type allele has been replaced (although not at the same site) with a transgene. Notably, knockout and transgenic animals can be produced by commercial facilities (e.g., The Lerner Research Institute, Cleveland, Ohio; B&K Universal, Inc., Fremont, Calif.; DNX Transgenic Sciences, Cranbury, N.J.; Incyte Genomics, Inc., St. Louis, Mo.).

Transgenic animals (e.g., mice) containing a nucleic acid molecule which encodes a human 12q23-qter polypeptide may be used as in vivo models to study the overexpression of a 12q23-qter gene. Such animals can also be used in drug evaluation and discovery efforts to find compounds effective to inhibit or modulate the activity of a 12q23-qter gene, such as, for example, compounds for treating respiratory disorders, diseases, or conditions. One having ordinary skill in the art can use standard techniques to produce transgenic animals which produce a human 12q23-qter polypeptide, and use the animals in drug evaluation and discovery projects (see, e.g., U.S. Pat. No. 4,873,191 to Wagner; U.S. Pat. No. 4,736,866 to Leder).

In another embodiment of the present invention, the transgenic animal can comprise a recombinant expression vector in which the nucleotide sequence that encodes a human 12q23-qter polypeptide is operably linked to a tissue specific promoter whereby the coding sequence is only expressed in that specific tissue. For example, the tissue specific promoter can be a mammary cell specific promoter and the recombinant protein so expressed is recovered from the animal's milk.

In yet another embodiment of the present invention, a 12q23-qter gene “knockout” can be produced by administering to the animal antibodies (e.g., neutralizing antibodies) that specifically recognize an endogenous 12q23-qter polypeptide. The antibodies can act to disrupt function of the endogenous 12q23-qter polypeptide, and thereby produce a null phenotype. In one specific example, an orthologous mouse 12q23-qter polypeptide or peptide can be used to generate antibodies. These antibodies can be given to a mouse to knock out the function of the mouse 12q23-qter ortholog.

In another embodiment of the present invention, non-mammalian organisms may be used to study 12q23-qter genes and 12q23-qter-related diseases. In particular, model organisms such as C. elegans, D. melanogaster, and S. cerevisiae may be used. Orthologs of 12q23-qter genes can be identified in these model organisms, and mutated or deleted to produce strains deficient for 12q23-qter genes. Human 12q23-qter genes can then be tested for the ability to “complement” the deficient strains. Such strains can also be used for drug screening. The 12q23-qter orthologs can be used to facilitate the understanding of the biological function of the human 12q23-qter genes, and assist in the identification of binding factors (e.g., agonists, antagonists, and inhibitors).

Gene Identification

To identify genes in the region on 12q23-qter, a set of bacterial artificial chromosome (BAC) clones containing this chromosomal region was identified. The BAC clones served as a template for genomic DNA sequencing and as reagents for identifying coding sequences by direct cDNA selection. Genomic sequencing and direct cDNA selection were used to characterize DNA from 12q23-qter in accordance with the methods described in detail herein.

When a gene has been genetically localized to a specific chromosomal region, the genes in this region can be characterized at the molecular level by a series of steps that include: (1) cloning the entire region of DNA in a set of overlapping genomic clones (physical mapping); (2) characterizing the genes encoded by these clones by a combination of direct cDNA selection, exon trapping and DNA sequencing (gene identification); and (3) identifying mutations in these genes by comparative DNA sequencing of affected and unaffected members of the kindreds and/or in unrelated affected individuals and unrelated unaffected controls (mutation analysis).

Physical mapping is accomplished by screening libraries of human DNA cloned in vectors that are propagated in a host such as E. coli, using hybridization or PCR assays from unique molecular landmarks in the chromosomal region of interest. To generate a physical map of the disorder region, a library of human DNA cloned in BACs was screened with a set of overgo markers that had been previously mapped to chromosome 12q23-qter by the efforts of the Human Genome Project. Overgos are unique molecular landmarks in the human genome that can be assayed by hybridization. Through the combined efforts of the Human Genome Project, the location of thousands of overgos on the twenty-two autosomes and two sex chromosomes has been determined. For a positional cloning effort, the physical map is tied to the genetic map because the markers used for genetic mapping can also be used as overgos for physical mapping. By screening a BAC library with a combination of overgos derived from genetic markers, genes, and random DNA fragments, a physical map comprised of overlapping clones representing all of the DNA in a chromosomal region of interest can be assembled.

BACs are cloning vectors for large (80 kilobase to 200 kilobase) segments of human or other DNA that are propagated in E. coli. To construct a physical map using BACs, a library of BAC clones is screened so that individual clones harboring the DNA sequence corresponding to a given overgo or set of overgos are identified. Throughout most of the human genome, the overgo markers are spaced approximately 20 to 50 kilobases apart, so that an individual BAC clone typically contains at least two overgo markers. In addition, the BAC libraries that were screened contain enough cloned DNA to cover the human genome twelve times over. Accordingly, an individual overgo typically identifies more than one BAC clone. By screening a twelve-fold coverage BAC library with a series of overgo markers spaced approximately 50 kilobases apart, a physical map consisting of a series of overlapping contiguous BAC clones, i.e., BAC “contigs,” can be assembled for any region of the human genome. This map is closely tied to the genetic map because many of the overgo markers used to prepare the physical map are also genetic markers.
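The arithmetic behind these statements can be checked with a brief Python sketch. The insert sizes, marker spacings, and twelve-fold coverage come from the paragraph above; the genome size of 3,000,000 kilobases and the 150 kilobase average insert are assumptions made only for the example, as are the function names.

```python
import math


def markers_per_clone(insert_kb, spacing_kb):
    """Average number of markers expected on one clone of the given size."""
    return insert_kb / spacing_kb


def clones_for_coverage(genome_kb, insert_kb, fold):
    """Number of clones needed for the requested fold coverage."""
    return math.ceil(fold * genome_kb / insert_kb)


# A 200 kb insert with markers every 50 kb carries about four markers.
print(markers_per_clone(200, 50))

# Assumed values: 3,000,000 kb genome, 150 kb average insert, 12-fold coverage.
print(clones_for_coverage(3_000_000, 150, 12))
```

With twelve-fold coverage, any given position in the genome is represented on about twelve clones on average, which is why a single overgo typically identifies more than one BAC clone.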

When constructing a physical map, it often happens that there are gaps in the overgo map of the genome that result in the inability to identify BAC clones that are overlapping in a given location. Typically, the physical map is first constructed from a set of overgos identified through the publicly available literature and World Wide Web resources. The initial map consists of several separate BAC contigs that are separated by gaps of unknown molecular distance. To identify BAC clones that fill these gaps, it is necessary to develop new overgo markers from the ends of the clones on either side of the gap. This is done by sequencing the terminal 200 to 300 base pairs of the BACs flanking the gap, and developing a PCR or hybridization based assay. If the terminal sequences are demonstrated to be unique within the human genome, then the new overgo can be used to screen the BAC library to identify additional BACs that contain the DNA from the gap in the physical map. To assemble a BAC contig that covers a region the size of the disorder region (6,000,000 or more base pairs), it is necessary to develop new overgo markers from the ends of a number of clones.

After building a BAC contig, this set of overlapping clones serves as a template for identifying the genes encoded in the chromosomal region. Gene identification can be accomplished by many methods. Three methods are commonly used: (1) a set of BACs selected from the BAC contig to represent the entire chromosomal region can be sequenced, and computational methods can be used to identify all of the genes, (2) the BACs from the BAC contig can be used as a reagent to clone cDNAs corresponding to the genes encoded in the region by a method termed direct cDNA selection, or (3) the BACs from the BAC contig can be used to identify coding sequences by selecting for specific DNA sequence motifs in a procedure called exon trapping. The present invention includes chromosome 12q23-qter genes identified by the first two methods.

To sequence the entire BAC contig representing the disorder region, a set of BACs can be chosen for subcloning into plasmid vectors and subsequent DNA sequencing of these subclones. Since the DNA cloned in the BACs represents genomic DNA, this sequencing is referred to as genomic sequencing to distinguish it from cDNA sequencing. To initiate the genomic sequencing for a chromosomal region of interest, several non-overlapping BAC clones are chosen. DNA for each BAC clone is prepared, and the DNA is sheared into random small fragments, which are subsequently cloned into standard plasmid vectors such as pUC18. The plasmid clones are then grown to propagate the smaller fragments, and these are the templates for sequencing. To ensure adequate coverage and sequence quality for the BAC DNA sequence, sufficient plasmid clones are sequenced to yield three-fold coverage of the BAC clone. For example, if the BAC is 100 kilobases long, then plasmid clones are sequenced to yield 300 kilobases of sequence. Since the BAC DNA was randomly sheared prior to cloning in the plasmid vector, the 300 kilobases of raw DNA sequence can be assembled by computational methods into overlapping DNA sequences termed sequence contigs. For the purposes of initial gene identification by computational methods, three-fold coverage of each BAC is sufficient to yield twenty to forty sequence contigs of 1000 base pairs to 20,000 base pairs.
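The coverage figures above can be illustrated with a short Python calculation. The 100 kilobase clone and three-fold coverage are taken from the text; the 500 base pair read length is an assumption, and the estimates of covered fraction and contig number use the simple Lander-Waterman approximation (ignoring any minimum overlap requirement), which is introduced here only for illustration and is not described in this disclosure.

```python
import math


def shotgun_estimate(clone_bp, fold, read_bp):
    """Rough shotgun sequencing estimates for one clone.

    Returns (total_bases, n_reads, covered_fraction, expected_contigs).
    covered_fraction and expected_contigs follow the Lander-Waterman
    approximation: 1 - exp(-c) and N * exp(-c) for coverage c and N reads.
    """
    total_bases = clone_bp * fold
    n_reads = math.ceil(total_bases / read_bp)
    covered_fraction = 1.0 - math.exp(-fold)
    expected_contigs = n_reads * math.exp(-fold)
    return total_bases, n_reads, covered_fraction, expected_contigs


# 100 kb BAC at three-fold coverage, with an assumed read length of 500 bp.
total, reads, covered, contigs = shotgun_estimate(100_000, 3, 500)
print(f"raw sequence: {total} bp in {reads} reads")
print(f"approx. fraction covered: {covered:.2f}, approx. contigs: {contigs:.0f}")
```

Under these assumptions the approximation gives a contig count within the twenty to forty range stated above, although real assemblies depend on read length, overlap thresholds, and repeat content.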

The sequencing strategy employed in this invention was to initially sequence “seed” BACs from the BAC contig in the disorder region. The sequence of the “seed” BACs was then used to identify minimally overlapping BACs from the contig, and these were subsequently sequenced. In this manner, the entire candidate region can be sequenced, with several small sequence gaps left in each BAC. This sequence serves as the template for computational gene identification.

In one approach, genes can be identified by comparing the sequence of the BAC contig to publicly available databases of cDNA and genomic sequences, e.g., UniGene, dbEST, EMBL nucleotide database, GenBank, and the DNA Database of Japan (DDBJ). The BAC DNA sequence can also be translated into protein sequence, and the protein sequence can be used to search publicly available protein databases, e.g., GenPept, EMBL protein database, Protein Information Resource (PIR), Protein Data Bank (PDB), and SWISS-PROT. These comparisons are typically done using the BLAST family of computer algorithms and programs (Altschul et al., 1990, J. Mol. Biol., 215:403-410; Altschul et al., 1997, Nucl. Acids Res., 25:3389-3402). For nucleotide queries, BLASTN, BLASTX, and TBLASTX can be used. BLASTN compares a nucleotide query sequence with a nucleotide sequence database; BLASTX compares a nucleotide query sequence translated in all reading frames against a protein sequence database; TBLASTX compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database. For protein queries, BLASTP and TBLASTN can be used. BLASTP compares a protein query sequence with a protein sequence database; TBLASTN compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
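The pairing of query type, database type, and BLAST program described above can be summarized as a small lookup. This sketch only restates the paragraph; the function and dictionary names are hypothetical, and it does not invoke any BLAST software.

```python
# Which BLAST programs apply to each (query type, database type) pair,
# as described in the text above.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): ["BLASTN", "TBLASTX"],  # TBLASTX translates both sides
    ("nucleotide", "protein"): ["BLASTX"],                # query translated in all frames
    ("protein", "protein"): ["BLASTP"],
    ("protein", "nucleotide"): ["TBLASTN"],               # database translated in all frames
}


def blast_programs(query_type: str, database_type: str) -> list:
    """Return the BLAST programs suitable for the given query and database types."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError("types must be 'nucleotide' or 'protein'") from None


if __name__ == "__main__":
    print(blast_programs("nucleotide", "protein"))
```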

Additionally, computer algorithms such as MZEF (Zhang, 1997, Proc. Natl. Acad. Sci. USA 94:565-568), GRAIL (Uberbacher et al., 1996, Methods Enzymol. 266:259-281), and Genscan (Burge and Karlin, 1997, J. Mol. Biol., 268:78-94) can be used to predict the location of exons in the sequence based on the presence of specific DNA sequence motifs that are common to all exons, as well as the presence of codon usage typical of human protein encoding sequences.

In addition to identifying genes by computational methods, genes can be identified by direct cDNA selection (Del Mastro and Lovett, 1996, Methods in Molecular Biology, Humana Press Inc., NJ). In direct cDNA selection, cDNA pools from tissues of interest are prepared, and BACs from the candidate region are used in a liquid hybridization assay to capture the cDNAs which base pair to coding regions in the BAC. In the methods described herein, the cDNA pools were created from several different tissues by random priming and oligo dT priming the first strand cDNA from poly A⁺ RNA, synthesizing the second-strand cDNA by standard methods, and adding linkers to the ends of the cDNA fragments. In this approach, the linkers are used to amplify the cDNA pools of BAC clones from the disorder region identified by screening a BAC library. The amplified products are then used as a template for initiating DNA synthesis to create a biotin labeled copy of BAC DNA. Following this, the biotin labeled copy of the BAC DNA is denatured and incubated with an excess of the PCR amplified, linkered cDNA pools which have also been denatured. The BAC DNA and cDNA are allowed to anneal in solution, and heteroduplexes between the BAC and the cDNA are isolated using streptavidin coated magnetic beads. The cDNAs that are captured by the BAC are then amplified using primers complementary to the linker sequences, and the hybridization/selection process is repeated for a second round. After two rounds of direct cDNA selection, the cDNA fragments are cloned, and a library of these direct selected fragments is created.

The cDNA clones isolated by direct selection are analyzed by two methods. Since a pool of BACs from the disorder region is used to provide the genomic target DNA sequence, the cDNAs must be mapped to BAC genomic clones to verify their chromosomal location. This is accomplished by arraying the cDNAs in microtiter dishes, and replicating their DNA in high-density grids. Individual genomic clones known to map to the region are then hybridized to the grid to identify direct selected cDNAs mapping to that region. cDNA clones that are confirmed to correspond to individual BACs are sequenced. To determine whether the cDNA clones isolated by direct selection share sequence identity or similarity to previously identified genes, the DNA and protein coding sequences are compared to publicly available databases using the BLAST family of programs.

The combination of genomic DNA sequence and cDNA sequence provided by BAC sequencing and by direct cDNA selection yields an initial list of putative genes in the region. The genes in the region were all candidates for the asthma locus. To further characterize each gene, Northern blots were performed to determine the size of the transcript corresponding to each gene, and to determine which putative exons were transcribed together to make an individual gene. For Northern blot analysis of each gene, probes were prepared from direct selected cDNA clones or by PCR amplifying specific fragments from genomic DNA, cDNA or from the BAC encoding the putative gene of interest. The Northern blots gave information on the size of the transcript and the tissues in which it was expressed. For transcripts that were not highly expressed, it was sometimes necessary to perform a reverse transcription PCR assay using RNA from the tissues of interest as a template for the reaction.

Gene identification by computational methods and by direct cDNA selection provides unique information about the genes in a region of a chromosome. When genes are identified, it is possible to examine different individuals for mutations in each gene. Variants in gene sequences between individuals can be inherited allelic differences or can arise from mutations in the individuals. Gene sequence variants are clinically important in that they can affect drug action on such genes. Most drugs elicit a safe response in only a fraction of individuals, and drugs are commonly administered to patients with no certainty that they will be safe and effective. Many important drugs are effective in only 30-40% of patients for whom the drug is prescribed, and virtually all drugs cause adverse events in some individuals. Identification of mutations in disorder genes in different individuals will enable a correlation between the safety and efficacy of drug therapies used to treat lung diseases and the genotypes of the treated individuals. This correlation enables health care providers to prescribe a drug regimen that is most appropriate for the individual patient rather than trying different drug regimens in turn until a successful drug is identified. Identification of variants in disorder genes will also have a benefit during the development of new drugs for the treatment of lung diseases, as the ability to correlate genetic variation with the efficacy of new candidate drugs will enhance lead optimization and increase the efficiency and success rate of new drug approvals.

Gene identification by computational methods and by direct cDNA selection provides unique information about the genes in a region of a chromosome. Once genes are identified, it is possible to examine subjects for sequence variants. Variant sequences can be inherited as allelic differences or can arise from spontaneous mutations. Inherited alleles can be analyzed for linkage to a disease susceptibility locus. Linkage analysis is possible because of the nature of inheritance of chromosomes from parents to offspring. During meiosis, the two parental homologs pair to guide their proper separation to daughter cells. While they are paired, the two homologs exchange pieces of the chromosomes, in an event called “crossing over” or “recombination.” The resulting chromosomes contain parts that originate from both parental homologs. The closer together two sequences are on the chromosome, the less likely that a recombination event will occur between them, and the more closely linked they are.

Data obtained from the different families can be combined and analyzed together by a computer using statistical methods described herein. The results can then be used as evidence for linkage between the genetic markers used and an asthma susceptibility locus. In general, a recombination frequency of 1% is equivalent to approximately 1 map unit, a relationship that holds up to frequencies of about 20% or 20 cM. One centimorgan (cM) is roughly equivalent to 1,000 kb of DNA. The entire human genome is 3,300 cM long. In order to find an unknown disease gene within 5-10 cM of a marker locus, the whole human genome can be searched with roughly 330 informative marker loci spaced at approximately 10 cM intervals (Botstein et al., 1980, Am. J. Hum. Genet. 32:314-331).
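The marker count quoted above follows directly from the genome length and the marker spacing. A minimal sketch of that arithmetic, using the approximate conversions stated in the text (the function names are hypothetical):

```python
import math


def markers_for_genome_scan(genome_cm: float, spacing_cm: float) -> int:
    """Approximate number of evenly spaced markers needed to span the genome."""
    return math.ceil(genome_cm / spacing_cm)


def approx_physical_kb(genetic_cm: float, kb_per_cm: float = 1000.0) -> float:
    """Rough physical size, using the text's estimate of about 1,000 kb per cM."""
    return genetic_cm * kb_per_cm


if __name__ == "__main__":
    print(markers_for_genome_scan(3300, 10))  # markers at 10 cM spacing
    print(approx_physical_kb(10))             # kb spanned by a 10 cM interval
```

With markers every 10 cM, any locus lies within about 5 cM of the nearest marker, which is consistent with the 5-10 cM search distance given above.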

The reliability of linkage results is established by using a number of statistical methods. The methods most commonly used for the detection by linkage analysis of oligogenes involved in the etiology of a complex trait are non-parametric or model-free methods which have been implemented into the computer programs MAPMAKER/SIBS (L. Kruglyak and E. S. Lander, 1995, Am. J. Hum. Genet. 57:439-454) and GENEHUNTER (L. Kruglyak et al., 1996, Am. J. Hum. Genet. 58:1347-1363). Typically, linkage analysis is performed by typing members of families with multiple affected individuals at a given marker locus and evaluating if the affected members (excluding parent-offspring pairs) share alleles at the marker locus that are identical by descent (IBD) more often than expected by chance alone.

As a result of the rapid advances in mapping the human genome over the last few years, and concomitant improvements in computer methodology, it has become feasible to carry out linkage analyses using multi-point data. Multi-point analysis provides a simultaneous analysis of linkage between the trait and several linked genetic markers, when the recombination distance among the markers is known. A LOD score statistic is computed at multiple locations along a chromosome to measure the evidence that a susceptibility locus is located nearby. A LOD score is the logarithm base 10 of the ratio of the likelihood that a susceptibility locus exists at a given location to the likelihood that no susceptibility locus is located there. By convention, when testing a single marker, a total LOD score greater than +3.0 (that is, odds of linkage being 1,000 times greater than odds of no linkage) is considered to be significant evidence for linkage.
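To make the LOD definition concrete, the sketch below computes a two-point LOD score for the simplest textbook case, phase-known meioses scored as recombinant or non-recombinant. This is an illustrative assumption and not the multipoint allele-sharing statistic computed by MAPMAKER/SIBS or GENEHUNTER; the function names are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[ L(theta) / L(theta = 0.5) ] for phase-known meioses.

    theta is the recombination fraction being tested (0 <= theta <= 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    linked = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)
    unlinked = 0.5 ** meioses
    if linked == 0.0:
        return float("-inf")  # observed recombinants are impossible at theta = 0
    return math.log10(linked / unlinked)


def lod_to_odds(lod: float) -> float:
    """Convert a LOD score back to an odds ratio in favour of linkage."""
    return 10.0 ** lod


if __name__ == "__main__":
    # Ten informative meioses with no recombinants, tested at theta = 0.
    print(two_point_lod(0, 10, 0.0))
    print(lod_to_odds(3.0))
```

In this simplified setting, ten informative meioses with no recombinants are just enough to exceed the conventional threshold of +3.0.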

Multi-point analysis is advantageous for two reasons. First, the informativeness of the pedigrees is usually increased. Each pedigree has a certain amount of potential information, dependent on the number of parents heterozygous for the marker loci and the number of affected individuals in the family. However, few markers are sufficiently polymorphic as to be informative in all those individuals. If multiple markers are considered simultaneously, then the probability of an individual being heterozygous for at least one of the markers is greatly increased. Second, an indication of the position of the disease gene among the markers may be determined. This allows identification of flanking markers, and thus eventually allows identification of a small region in which the disease gene resides.

EXAMPLES

The examples as set forth herein are meant to exemplify the various aspects of the present invention and are not intended to limit the invention in any way.

Example 1 Family Collection

Asthma is a complex disorder that is influenced by a variety of factors, including both genetic and environmental effects. Complex disorders are typically caused by multiple interacting genes, some contributing to disease development and some conferring a protective effect. The success of linkage analyses in identifying chromosomes with significant LOD scores is achieved in part as a result of an experimental design tailored to the detection of susceptibility genes in complex diseases, even in the presence of epistasis and genetic heterogeneity. Also important are rigorous efforts in ascertaining asthmatic families that meet strict guidelines, and collecting accurate clinical information.

Given the complex nature of the asthma phenotype, non-parametric affected sib pair analyses were used to analyze the genetic data. This approach does not require parameter specifications such as mode of inheritance, disease allele frequency, penetrance of the disorder, or phenocopy rates. Instead, it determines whether the inheritance pattern of a chromosomal region is consistent with random segregation. If it is not, affected siblings inherit identical copies of alleles more often than expected by chance. Because no models for inheritance are assumed, allele-sharing methods tend to be more robust than parametric methods when analyzing complex disorders. They do, however, require larger sample sizes to reach statistically significant results.
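The allele-sharing idea can be illustrated with a simple "means test" on affected sib pairs. Under random segregation a sib pair shares 0, 1, or 2 alleles identical by descent with probabilities 1/4, 1/2, and 1/4, so the expected proportion shared is 0.5 with variance 1/8 per pair. The sketch below assumes every pair is fully informative and uses hypothetical counts; it is a simpler statistic than the likelihood-based analysis used in the Examples.

```python
import math


def mean_ibd_sharing_test(n_ibd0: int, n_ibd1: int, n_ibd2: int):
    """Return (mean proportion of alleles shared IBD, z statistic).

    Inputs are counts of affected sib pairs sharing 0, 1, or 2 alleles IBD.
    Under the null of random segregation the mean is 0.5 and the variance
    of the per-pair proportion is 1/8.
    """
    n_pairs = n_ibd0 + n_ibd1 + n_ibd2
    if n_pairs == 0:
        raise ValueError("at least one sib pair is required")
    mean_sharing = (0.5 * n_ibd1 + 1.0 * n_ibd2) / n_pairs
    null_sd = math.sqrt(1.0 / (8.0 * n_pairs))
    z = (mean_sharing - 0.5) / null_sd
    return mean_sharing, z


if __name__ == "__main__":
    # Hypothetical counts for 100 sib pairs, for illustration only.
    print(mean_ibd_sharing_test(25, 50, 25))  # sharing as expected by chance
    print(mean_ibd_sharing_test(10, 50, 40))  # excess sharing
```

A one-sided test is appropriate here, since only excess sharing among affected siblings is evidence for linkage.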

At the outset of the program, the goal was to collect 400 affected sib-pair families for the linkage analyses. Based on a genome scan with markers spaced ~10 cM apart, this number of families was predicted to provide >95% power to detect an asthma susceptibility gene that caused an increased risk to first-degree relatives of 3-fold or greater. The assumed relative risk of 3-fold was consistent with epidemiological studies in the literature that suggest an increased risk ranging from 3- to 7-fold. The relative risk was based on gender, different classifications of the asthma phenotype (i.e., bronchial hyper-responsiveness versus physician's diagnosis) and, in the case of offspring, whether one or both parents were asthmatic.

The family collection efforts exceeded the initial goal of 400, and resulted in a total of 444 affected sibling pair (ASP) families, with 342 families from the UK and 102 families from the US. The ASP families in the US collection were Caucasian with a minimum of two affected siblings that were identified through both private practice and community physicians as well as through advertising. A total of 102 families were collected in Kansas, Nebraska, and Southern California. In the UK collection, Caucasian families with a minimum of two affected siblings were identified through physicians' registers in a region surrounding Southampton and including the Isle of Wight. In both the US and UK collections, additional affected and unaffected sibs were collected whenever possible.

An additional 63 families from the United Kingdom were utilized from an earlier collection effort with different ascertainment criteria. These families were recruited either: 1) without reference to asthma and atopy; or 2) by having at least one family member or at least two family members affected with asthma. The randomly ascertained samples were identified from general practitioner registers in the Southampton area. For families with affected members, the probands were recruited from hospital based clinics in Southampton. Seven pedigrees extended beyond a single nuclear family. The phenotypic and genotypic data for 17 markers for 21 of these 63 families were obtained from the website http://cedar.genetics.soton.ac.uk/pub/PROGRAMS/BETA/data/bet12.ped.

Families were included in the study if they met all of the following criteria: 1) the biological mother and biological father were Caucasian and agreed to participate in the study; 2) at least two biological siblings were alive, each with a current physician diagnosis of asthma, and were 5 to 21 years of age; and 3) the two siblings were currently taking asthma medications on a regular basis. This included regular, intermittent use of inhaled or oral bronchodilators and regular use of cromolyn, theophylline, or steroids.

Families were excluded from the study if they met any one of the following criteria: 1) both parents were affected (i.e., with a current diagnosis of asthma, having asthma symptoms, or on asthma medications at the time of the study); 2) any of the siblings to be included in the study was less than 5 years of age; 3) any asthmatic family member to be included in the study was taking beta-blockers at the time of the study; 4) any family member to be included in the study had congenital or acquired pulmonary disease at birth (e.g., cystic fibrosis), a history of serious cardiac disease (e.g., myocardial infarction), or any history of serious pulmonary disease (e.g., emphysema); or 5) any family member to be included in the study was pregnant.

An extensive clinical instrument was designed and data from all participating family members were collected. The case report form (CRF) included questions on demographics, medical history including medications, a health survey on the incidence and frequency of asthma, wheeze, eczema, hay fever, nasal problems, smoking, and questions on home environment. Data from a video questionnaire designed to show various examples of wheeze and asthmatic attacks were also included in the CRF. Clinical data, including skin prick tests to 8 common allergens, total and specific IgE levels, and bronchial hyper-responsiveness following a methacholine challenge, were also collected from all participating family members. All data were entered into a SAS dataset by IMTCI, a CRO, either by double data entry or by scanning followed by on-screen visual validation. An extensive automated review of the data was performed on a routine basis, and a full audit at the conclusion of the data entry was completed to verify the accuracy of the dataset.

Example 2 Genome Scan

In order to identify chromosomal regions linked to asthma, the inheritance pattern of alleles from genetic markers spanning the genome was assessed on the collected family resources. As described above, combining these results with the segregation of the asthma phenotype in these families allows the identification of genetic markers that are tightly linked to asthma. In turn, this provides an indication of the location of genes predisposing affected individuals to asthma. The genotyping strategy was twofold: 1) to conduct a genome wide scan using markers spaced at approximately 10 cM intervals; and 2) to target ten chromosomal regions for high density genetic mapping. The initial candidate regions for high-density mapping were chosen based on suggestions of linkage to these regions by other investigators.

Genotypes of PCR amplified simple sequence microsatellite genetic linkage markers were determined using ABI model 377 Automated Sequencers (PE Applied Biosystems). Microsatellite markers were obtained from Research Genetics Inc. (Huntsville, Ala.) in the fluorescent dye-conjugated form (see Dubovsky et al., 1995, Hum. Mol. Genet. 4(3):449-452). The markers comprised a variation of a human linkage mapping panel as released from the Cooperative Human Linkage Center (CHLC), also known as the Weber lab screening set version 8. The variation of the Weber 8 screening set consisted of 529 markers with an average spacing of 6.9 cM (autosomes only) and 7.0 cM (all chromosomes). Eighty-nine percent of the markers consisted of either tri- or tetra-nucleotide microsatellites. There were no gaps present in chromosomal coverage greater than 17.5 cM.

Study subject genomic DNA (5 μl; 4.5 ng/μl) was amplified in a 10 μl PCR reaction using AmpliTaq Gold DNA polymerase (0.225 U); 1×PCR buffer (80 mM (NH₄)₂SO₄; 30 mM Tris-HCl (pH 8.8); 0.5% Tween-20); 200 μM each dATP, dCTP, dGTP and dTTP; 1.5-3.5 μM MgCl₂; and 250 μM forward and reverse PCR primers. PCR reactions were set up in 192 well plates (Costar) using a Tecan Genesis 150 robotic workstation equipped with a refrigerated deck. PCR reactions were overlaid with 20 μl mineral oil, and thermocycled on an MJ Research Tetrad DNA Engine equipped with four 192 well heads using the following conditions: 92° C. for 3 min; 6 cycles of 92° C. for 30 sec, 56° C. for 1 min, 72° C. for 45 sec; followed by 20 cycles of 92° C. for 30 sec, 55° C. for 1 min, 72° C. for 45 sec; and a 6 min incubation at 72° C.
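The cycling conditions above can be written down as a small data structure, which makes it easy to check the total programmed hold time and the template mass per reaction. This is a sketch only: the variable and function names are hypothetical, and ramp times between temperatures are ignored.

```python
# Each stage is (number of cycles, [(temperature in deg C, hold time in seconds), ...]),
# transcribed from the conditions stated in the text.
PCR_PROGRAM = [
    (1, [(92, 180)]),                       # initial 3 min at 92 C
    (6, [(92, 30), (56, 60), (72, 45)]),    # 6 cycles annealing at 56 C
    (20, [(92, 30), (55, 60), (72, 45)]),   # 20 cycles annealing at 55 C
    (1, [(72, 360)]),                       # final 6 min at 72 C
]


def total_hold_seconds(program) -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    return sum(cycles * sum(sec for _, sec in steps) for cycles, steps in program)


def template_mass_ng(volume_ul: float, conc_ng_per_ul: float) -> float:
    """Mass of genomic DNA template added to one reaction."""
    return volume_ul * conc_ng_per_ul


if __name__ == "__main__":
    print(total_hold_seconds(PCR_PROGRAM) / 60)  # minutes of programmed holds
    print(template_mass_ng(5, 4.5))              # ng of template per reaction
```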

PCR products of 8-12 microsatellite markers were subsequently pooled into two 96-well microtitre plates (2.0 μl PCR product from TET and FAM labeled markers, 3.0 μl HEX labeled markers) using a Tecan Genesis 200 robotic workstation and brought to a final volume of 25 μl with H₂O. Following this, 1.9 μl of pooled PCR product was transferred to a loading plate and combined with 3.0 μl loading buffer (2.5 μl formamide/blue dextran (9.0 mg/ml), 0.5 μl GS-500 TAMRA labeled size standard, ABI). Samples were denatured in the loading plate for 4 min at 95° C., placed on ice for 2 min, and electrophoresed on a 5% denaturing polyacrylamide gel (FMC on the ABI 377XL). Samples (0.8 μl) were loaded onto the gel using an 8 channel Hamilton Syringe pipettor.

Each gel consisted of 62 study subjects and 2 control subjects (CEPH parents ID #1331-01 and 1331-02, Coriell Cell Repository, Camden, N.J.). Genotyping gels were scored in duplicate by investigators blind to patient identity and affection status using GENOTYPER analysis software V 1.1.12 (ABI; PE Applied Biosystems). Nuclear families were loaded onto the gel with the parents flanking the siblings to facilitate error detection. The final tables obtained from the GENOTYPER output for each gel analysed were imported into a SYBASE Database.

Allele calling (binning) was performed using the SYBASE version of the ABAS software (Ghosh et al., 1997, Genome Research 7:165-178). Offsize bins were checked manually and incorrect calls were corrected or blanked. The binned alleles were then imported into the program MENDEL (Lange et al., 1988, Genetic Epidemiology, 5:471) for inheritance checking using the USERM13 subroutine (Boehnke et al., 1991, Am. J. Hum. Genet. 48:22-25). Non-inheritance was investigated by examining the genotyping traces and, once all discrepancies were resolved, the subroutine USERM13 was used to estimate allele frequencies.
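The idea behind inheritance checking can be shown with a minimal trio check: a child's genotype is consistent if it can be formed from one paternal and one maternal allele. The sketch below is not the MENDEL or USERM13 implementation; the function name and the allele values (bin sizes in base pairs) are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child) -> bool:
    """True if the child's two alleles can be explained by one allele from each parent.

    Each genotype is a pair of binned allele labels, e.g. (101, 105).
    """
    child_genotype = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_genotype
        for paternal, maternal in product(father, mother)
    )


if __name__ == "__main__":
    father, mother = (101, 105), (103, 105)
    print(is_mendelian_consistent(father, mother, (101, 103)))  # consistent
    print(is_mendelian_consistent(father, mother, (107, 103)))  # allele 107 unexplained
```

A failed check flags a marker and family for manual review of the genotyping traces, as described above; it does not by itself say which genotype is wrong.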

Example 3 Linkage Analysis

Chromosomal regions harboring asthma susceptibility genes were identified by linkage analysis of genotyping data and three separate phenotypes: asthma, bronchial hyper-responsiveness, and atopic status.

1. Asthma Phenotype:

For the initial linkage analysis, the phenotype and asthma affection status were defined by a patient who answered the following questions in the affirmative: i) Have you ever had asthma? ii) Do you have a current physician's diagnosis of asthma? and iii) Are you currently taking asthma medications? Medications included inhaled or oral bronchodilators, cromolyn, theophylline, or steroids. Multipoint linkage analyses of allele sharing in affected individuals were performed using the MAPMAKER/SIBS analysis program (L. Kruglyak and E. S. Lander, 1995, Am. J. Hum. Genet. 57:439-454). The analyses were performed using 54 polymorphic markers spanning a 162 cM region on both arms of chromosome 12. The map location and distances between markers were obtained from the genetic maps published by the Marshfield Medical Research Foundation; Marshfield, Wis. Ambiguous ordering of markers in the Marshfield map was resolved using the program MULTIMAP (T. C. Matise et al., 1994, Nature Genet. 6:384-390).

FIG. 1A shows the multipoint LOD score against the map location of markers along chromosome 12. A Maximum LOD Score (MLS) of 2.9, based on 484 nuclear families, was obtained at location 161.7 cM, 1.0 cM distal to markers D12S97 and D12S1045. An excess sharing by descent (Identity By Descent; IBD=2) of 0.31 was observed at the MLS. Table 1B shows the two-point and multipoint LOD scores at each marker.

TABLE 1B
CHROMOSOME 12 LINKAGE ANALYSIS

Marker      Distance (cM)  Two-point LOD  Multipoint LOD
D12S372       6.4          0.0            0.0
GATA49D12    17.7          0.0            0.0
D12S77       20.3          0.0            0.0
D12S391      26.2          0.0            0.0
D12S358      26.2          0.0            0.0
D12S364      30.6          0.2            0.0
D12S373      36.1          0.0            0.0
D12S1042     48.7          0.0            0.0
GATA91H06    56.3          0.0            0.0
D12S368      66.0          0.2            0.3
D12S398      68.2          0.2            0.4
D12S83       75.2          1.1            0.0
D12S1294     78.1          0.0            0.0
IFNgama      80.4          0.0            0.0
D12S375      80.5          0.3            0.0
D12S43       80.5          0.3            0.0
D12S1052     83.2          0.0            0.0
D12S92       83.2          1.0            0.0
D12S326      86.4          0.1            0.1
D12S64       89.4          0.0            0.2
D12S379      93.7          0.0            0.1
D12S311      94.5          0.1            0.0
D12S82       95.0          0.1            0.1
D12S819      95.0          0.0            0.1
D12S1064     95.0          0.0            0.0
D12S95       96.1          0.2            0.2
D12S829      97.2          0.1            0.6
D12S1706    104.1          0.6            0.4
D12S1300    104.1          0.2            0.3
D12S1727    107.2          0.0            0.1
D12S1607    107.9          0.0            0.1
IGF1        109.5          0.0            0.0
PAH         109.5          0.0            0.0
D12S360     111.3          0.0            0.0
D12S338     111.9          0.0            0.0
D12S78      111.9          0.0            0.0
D12S811     120.7          0.1            0.3
D12S1341    123.0          0.0            0.5
NOS1        123.1          0.1            0.4
D12S2070    125.3          0.2            0.7
D12S366     133.3          1.2            1.7
D12S1619    134.5          0.8            1.8
D12S385     135.1          2.0            1.6
PLA2G1B     136.8          0.9            1.4
D12S395     136.8          2.1            1.5
D12S300     140.2          0.9            1.7
D12S342     144.8          1.6            2.2
D12S324     147.2          1.3            1.4
D12S2078    149.6          0.9            1.9
D12S1659    155.9          0.3            1.6
D12S97      160.7          0.9            2.7
D12S1045    160.7          3.0            2.8
D12S392     165.7          1.1            2.3
D12S357     168.8          0.8            1.1

2. Phenotypic Subgroups:

Nuclear families were ascertained by the presence of at least two affected siblings with a current physician's diagnosis of asthma, as well as the use of asthma medication. In the initial analysis (see above), the evidence was examined for linkage based on that dichotomous phenotype (asthma: yes/no). To further characterize the linkage signals, additional quantitative traits were measured in the clinical protocol. Since quantitative trait loci (QTL) analysis tools with correction for ascertainment were not available, the following approach was taken to refine the linkage and association analyses:

    i. Phenotypic subgroups that could be indicative of an underlying genotypic heterogeneity were identified. Asthma subgroups were defined according to 1) bronchial hyper-responsiveness (BHR) to methacholine challenge; or 2) atopic status using quantitative measures like total serum IgE and specific IgE to common allergens.

    ii. Non-parametric linkage analyses were performed on subgroups to test for the presence of a more homogeneous sub-sample. If genetic heterogeneity was present in the sample, the amount of allele sharing among phenotypically similar siblings was expected to increase in the appropriate subgroup in comparison to the full sample. A narrower region of significant increased allele sharing was also expected to result unless the overall LOD score decreased as a consequence of having a smaller sample size and of using an approximate partitioning of the data.

3. Results for BHR and IgE:

PC₂₀, the concentration of methacholine resulting in a 20% drop in FEV₁ (forced expiratory volume in one second), was polychotomized into four groups and analyses were performed on the subsets of asthmatic children with borderline to severe BHR (PC₂₀≤16 mg/ml) or PC₂₀(16). As shown in the LOD plot in FIG. 1B, the MLS for the subset of 218 nuclear families with at least two PC₂₀(16) affected sibs was 2.2 at D12S342 with an excess sharing of 0.33. The linkage results implicated a region of chromosome 12 centromeric to the region with the largest signal under the asthma phenotype (FIG. 1A), and indicated the presence of one or more genes with specific susceptibility toward BHR. Since the BHR sample represented a subset of the sample of asthmatics, it elucidated the presence of multiple peaks in the LOD plot of FIG. 1A.

Total IgE was dichotomized using an age specific cutoff for elevated levels (one standard deviation above the mean: 52 kU/L for age 5-9; 63 kU/L for age 10-14; 75 kU/L for age 15-18; and 81 kU/L for adults). Similarly, a dichotomous variable was created using specific IgE to common allergens. An individual was assigned a high specific IgE value if his/her level was positive (grass or tree) or elevated (>0.35 kU/L for cat, dog, mite A, mite B, alternaria, or ragweed) for at least one such measure.
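The age-specific dichotomization of total IgE can be expressed as a short function. The cutoffs are those stated above; the function names, the use of age in whole years, the treatment of anyone older than 18 as an adult, and the use of a strict "greater than" comparison are assumptions made for illustration.

```python
def total_ige_cutoff_ku_per_l(age_years: int) -> float:
    """Age-specific cutoff (kU/L) for elevated total IgE, as listed in the text."""
    if age_years < 5:
        raise ValueError("no cutoff is given in the text for ages under 5")
    if age_years <= 9:
        return 52.0
    if age_years <= 14:
        return 63.0
    if age_years <= 18:
        return 75.0
    return 81.0  # assumed to apply to everyone older than 18


def has_elevated_total_ige(age_years: int, total_ige_ku_per_l: float) -> bool:
    """Dichotomous total IgE status; strict inequality is an assumption."""
    return total_ige_ku_per_l > total_ige_cutoff_ku_per_l(age_years)


if __name__ == "__main__":
    print(has_elevated_total_ige(7, 60.0))   # a 7-year-old with 60 kU/L
    print(has_elevated_total_ige(30, 60.0))  # an adult with 60 kU/L
```

The same measured level can therefore be classified differently depending on the age band of the individual.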

In linkage analyses, the subset of asthmatic children with high total IgE (274 families) gave a maximum LOD score of 2.3 at D12S1619 (FIG. 1C) with an excess sharing of 0.33. The subset with high specific IgE (288 families) gave a LOD score of 2.2 at 164.2 cM, 1.5 cM proximal to marker D12S392 with an excess sharing of 0.33 (FIG. 1D). The analysis with the subset of asthmatic sibs with elevated total IgE implicated a region similar to the one identified with the BHR subset. The region implicated by the subset of asthmatics with elevated specific IgE coincided with the location of the largest signal in the original asthma sample.

Accordingly, a pattern of evidence by linkage analysis pointed to the existence of several asthma susceptibility loci in the 12q23-qter region of chromosome 12. This was supported by the initial analysis of the asthma (yes/no) phenotype with further localization by analyses of BHR, total IgE, and specific IgE in asthmatic individuals. Thus, chromosome 12q23-qter encompassed genes involved in asthma and related diseases.

Example 4 Physical Mapping

The linkage results for chromosome 12 described above were used to delineate a candidate region for disorder-associated gene(s) located on chromosome 12. Gene discovery efforts were initiated in a ~43 cM interval from marker D12S2070 to the 12q telomere, representing a 99% confidence interval. All genes known to map to this interval were considered candidates. FIGS. 2A-2P show genes mapped against the GB4 panel and FIGS. 3A-3G show genes mapped against the Stanford G3 panel. The figures were obtained directly from the GeneMap99 web site.

Physical mapping (BAC contig construction) focused on a ~22 cM interval approximately between markers D12S307 and D12S2341. The discovery of novel genes using direct cDNA selection focused on a ~15 cM region between markers D12S1609 and D12S357. FIG. 4 shows the integration of the Marshfield Center for Medical Genetics genetic map with GeneMap99 from NCBI. The relevant regions are indicated at the top of the figure.

The following section describes the construction of a BAC contig spanning the disorder gene region on chromosome 12. This approach was used: 1) to provide genomic clones for DNA sequencing (analysis of this sequence would provide information about the gene content of the region); and 2) to provide reagents for direct cDNA selection (and provide additional information about novel genes mapping to the interval). The physical map consisted of an ordered set of molecular landmarks, and a set of BACs (U.-J. Kim et al., 1996, Genomics 34:213-218; H. Shizuya et al., 1992, Proc. Natl. Acad. Sci. USA 89:8794-8797) that contained the disorder gene region from human chromosome 12q23-qter.

FIGS. 5A-5I show the BAC/STS content contig map of human chromosome 12q23-qter. Markers used to screen the RPCI-11 BAC library (P. deJong, Roswell Park Cancer Institute (RPCI)) are shown in the top row. Markers that were present in the Genome Database (GDB, Research Triangle Institute (RTI) International; Research Triangle, N.C.) are represented by GDB nomenclature. The BAC clones are shown below the markers as horizontal lines.

1. Map Integration:

Various publicly available mapping resources were utilized to identify existing STS (sequence tagged site) markers in the 12q23-qter region (Olson et al., 1989, Science, 245:1434-1435). Resources included GDB, Genethon, the Marshfield Center for Medical Genetics, the Whitehead Institute Genome Center (Cambridge, Mass.), GeneMap98, dbSTS, and dbEST (NCBI), the Sanger Centre (United Kingdom), and the Stanford Human Genome Center (Stanford, Calif.). Maps were integrated manually to identify markers mapping to the disorder region. A list of markers is shown in Table 2.

2. Marker Development:

Sequences for existing STSs were obtained from the GDB, Radiation Hybrid Database (RHDB; United Kingdom), or NCBI, and were used to pick primer pairs (overgos; see Table 2) for BAC library screening. Novel markers were developed from publicly available genomic sequences, proprietary cDNA sequences, or from sequences derived from BAC insert ends (described below). Primers were chosen using a script that automatically performs vector and repetitive sequence masking using CROSSMATCH (P. Green, University of Washington). Subsequent primer selection was performed using a customized Filemaker Pro database (Filemaker, Inc.; Santa Clara, Calif.). Primers for use in PCR-based clone confirmation or radiation hybrid mapping (described below) were chosen using the program Primer3 (Steve Rozen and Helen J. Skaletsky, 1996, 1997; Rozen, S., and Skaletsky, H., 2000, “Primer3 on the WWW for general users and for biologist programmers,” in S. Krawetz and S. Misener, eds., Bioinformatics Methods and Protocols, in the series Methods in Molecular Biology, Humana Press, Totowa, N.J., pages 365-386).

TABLE 2 PRIMER PAIRS Seq Seq DNA ID ID Marker name Locus type GeneForward primer NO: Reverse primer NO: B0610N03-A1.x BACendCAAGCGATAGTTCTAATTTTCT 4689 TATGTGTTGGAGCCAGAAAATT 4714 B0600D18-A2.xBACend TGGTGTTCTCTGAGCTTCCAGG 4690 ACCGAACCAAAGATCCTGGAAG 4715B0611O14-A2.x BACend GTCTTGATTTTAAGGTTTGAGG 4691 CTGCCCTCACCTTGCCTCAAAC4716 B0700A09-A2.x BACend GCTGCTTCCAGCATTTCAGCAT 4692CAGTGTTATATGTGATGCTGAA 4717 B0716I10-A2.x BACend ATGATGCAGTGAGTGAGACCCA4693 CTTACTCACTACACTGGGTCTC 4718 B1118B13-A2.x BACendGCACTGGGTCTTCTCATCTGCT 4694 ACTCTCGTGGATAGAGCAGATG 4719 B1128N10-A2.xBACend CACGAGAGTCTAGTGGGGGTTT 4695 TCACTTGGCAGATGAAACCCCC 4720B0841C17-A2.x BACend TCCCCTGATATCCACTATCTTT 4696 CATTAGATGATGGTAAAGATAG4721 B0904G06-A2.x BACend ACTGTCTCATTCTTTACAGAAA 4697GGAACAGCAAACGTTTTCTGTA 4722 B0923J13-A2.x BACend CAGGTCTCTGCAGAGCATTTCT4698 GACTCTTGTTAACGAGAAATGC 4723 B0675M15-A2.x BACendGCAGACAATATCAAGAGTTCTT 4699 CTGTAACACATCTCAAGAACTC 4724 B0600D18-A2.yBACend TCATCTGCCAAGTGAGCCCAGT 4700 GACCTCACCAAAGCACTGGGCT 4725B0610N03-A2.y BACend GATACCAATGTGAAGTCCTTGA 4701 GTTTTCTTCCAGCCTCAAGGAC4726 B0700A09-A2.y BACend TCTCGATCCCACTAACCACGAT 4702ATGAAGTACATTGGATCGTGGT 4727 B1118B13-A2.y BACend ACTGGAATGCTCAGCTGGATGC4703 TTCTCCAGGGTCAAGCATCCAG 4728 B1128N10-A2.y BACendTGCTGATCTCTCAGTTCACCCT 4704 GCAAGCCACCCATCAGGGTGAA 4729 B0904G06-A2.yBACend ATCTAATGCTGTGGCCGCTGCT 4705 GGTTTGTTTGCTGCAGCAGCGG 4730B0923J13-A2.y BACend GACAGCCAGAGGAAACCTCTTC 4706 AAAAGTTGTCTTGGGAAGAGGT4731 B0675M15-A2.y BACend CACCTCTGGCTTTCCTACAACC 4707AGCTGTGACATGAAGGTTGTAG 4732 B0635H04-A1.x BACend AGCTTCGTCTGACCAGTCTACC4708 TTCAGGAACCACCAGGTAGACT 4733 B0666B20-A1.x BACendTGCCTGTGACTGAAGTCTTGAT 4709 GAGTGAGTAAGGAAATCAAGAC 4734 B0696D03-A1.xBACend AGGAAGAACAGAAGCAGTCTTT 4710 GTCATTATTTCCTCAAAGACTG 4735B0700H07-A1.x BACend TCCTGGGAAGCAAGAATAGGAA 4711 TCGCAGTGGCTTTGTTCCTATT4736 B0726A20-A1.x BACend ACTGTTGTCACCTCTGGGAAAG 4712AGTCTTCCAGGTCTCTTTCCCA 4737 B0761L21-A1.x BACend GAGTAAAAGAATGTGTATAGGG4713 TTTTTTGACCCACCCCCTATAC 4738 
B0814G06-A1.x BACendCGAGGAAGATGTAAGAGACTGT 4739 ATTGAGGCCCCAGAACAGTCTC 4768 B0857A05-A1.xBACend TCTTTAGTCCTTTGGGAGAGCT 4740 ATTTTCCCACAGGAAGCTCTCC 4769B0895C23-A1.x BACend AGGTGCTACCTCGCTCAATCTG 4741 GGGCTGGTTGCTCACAGATTGA4770 B0949E15-A1.x BACend CTTTTGAAGACGTGGGTTCTGT 4742GAATGCAAGCACTCACAGAACC 4771 B0604M16-A1.x BACend AGCCATAAACACACATTTCTAT4743 GATGCTCTGTGCATATAGAAAT 4772 B0615D12-A1.x BACendTCCACTGAGAGTTACCAAACCC 4744 GGTATGAGAATTGTGGGTTTGG 4773 B0633K01-A1.xBACend GTTCAGATTTTATCTTGGGTAT 4745 ACTGATGACATTTGATACCCAA 4774B0663H23-A1.x BACend GAGGTCCCTATTGCTGTGTTTT 4746 CAGCCAATGAAGTCAAAACACA4775 B0696L08-A1.x BACend ATCTGTAGCCTATAGTGAACAG 4747TTTACAGTGTTTGCCTGTTCAC 4776 B0702C13-A1.x BACend GTAGTAACAGAATGGACTTTGA4748 AGAGAGGAACAGCATCAAAGTC 4777 B0702F18-A1.x BACendCTCTGCATTTCTTACTCCTTAC 4749 AAGCTTTACTACCAGTAAGGAG 4778 B0728K24-A1.xBACend TCGCAAATAGCACAAGGGACTT 4750 CACCGTTATGCAGAAAGTCCCT 4779B0738O20-A1.x BACend TGAAGTTCGGAATCCCTGATAG 4751 AGGTTCCTACTGAGCTATCAGG4780 B0866B05-A1.x BACend AGCAGAAGAGCAGACCCTTCAA 4752GGAGCATCCAATCTTTGAAGGG 4781 B0598D10-A1.y BACend AGATGCTTATACTTGGTGTAAG4753 TACTTACACAGTTGCTTACACC 4782 B0635H04-A1.y BACendAGTCACACCTTATGAGGCATCA 4754 CTGTATGAATCCTCTGATGCCT 4783 B0666B20-A1.yBACend ATCCTGCTTTGTGGGTAGCCAC 4755 AATGCCACGGTGCAGTGGCTAC 4784B0700H07-A1.y BACend ACTCAAACCAACCTTCCATTCA 4756 GGTTAGGATTAGTGTGAATGGA4785 B0726A20-A1.y BACend TCAGTTCTCAGTCCTAGGAGAC 4757GGTCTTCTACTCCAGTCTCCTA 4786 B0761L21-A1.y BACend GCGAGGCCTGCTGTCTTTCTCA4758 AAATTAGCCAGGCATGAGAAAG 4787 B0814G06-A1.y BACendGCAGAGAGGTGGTGAGTGCATC 4759 TGACAGTTTCCTTTGATGCACT 4788 B0857A05-A1.yBACend TGCTTATCAAGATGCCTTTGCC 4760 AATCAGGCCATGAGGGCAAAGG 4789B0895C23-A1.y BACend CCATCCTTCATCCCCAGCAGTA 4761 CCCTGAATTTAGGTTACTGCTG4790 B0931G12-A1.y BACend AGAACCAGGCAGAGCTACCTGG 4762CTGGACCAGGAAATCCAGGTAG 4791 B0949E15-A1.y BACend ACTAGCTATTGAAGTGACTATC4763 ATGGGCAAAGAATAGATAGTCA 4792 B0604M16-A1.y BACendGTTTCAGCTGTGGAAAATGTTA 4764 TGTCTTCCTCCCCTTAACATTT 4793 B0633K01-A1.yBACend 
ATGCTGCTTCATATAACACATT 4765 CGGGAAGCATTTGCAATGTGTT 4794B0663H23-A1.y BACend CTCGCTCCATCTGCGATGCACA 4766 AGGTGATCACAGACTGTGCATC4795 B0696L08-A1.y BACend TGTTGTGTCAGAAACTCAGGAA 4767ACCCAGCTGAATCCTTCCTGAG 4796 B0702C13-A1.y BACend TCATGGGGGTGCTTTGACCTTG4797 TGGCCTCAAAGGCTCAAGGTCA 4826 B0702F18-A1.y BACendCATGGTCACCTGCAGCCTCTCA 4798 TGGCTAGAAGGAGGTGAGAGGC 4827 B0738O20-A1.yBACend AGAAGCGGGGTGAGCAGGACAT 4799 GTTACCCGGGAGTTATGTCCTG 4828B0866B05-A1.y BACend GATGTTGTCCGACAGGCATGGG 4800 TTCCTGTGTAGATCCCCATGCC4829 B0883G23-A1.y BACend GTGGTAGAATTGGCAAGCCTTG 4801CTCCAATCAGTTGCCAAGGCTT 4830 B0909L16-A1.y BACend GGTAAGGACACCTTCAAGGGAC4802 TGGAGTGCCCTGTTGTCCCTTG 4831 B0974M10-A1.x BACendATGCAAAGGTCTCAGGACGAAA 4803 CCCTTCCTGGACAATTTCGTCC 4832 B1118L08-A1.xBACend GGCATGTAGATCAAATGAAATA 4804 TGCTCCTAGCTGAATATTTCAT 4833B0723P10-A1.x BACend GGTAGCAGTCTTACACTGCTGG 4805 CCTTTCCGATGACCCCAGCAGT4834 B0748H09-A1.x BACend TGCCATGTAACGTTCATATTCC 4806GTTTTCCTGTGCAGGGAATATG 4835 B0825F09-A1.x BACend ATACCCACAGGGTAGTAACAGT4807 TTGTGGCTCAAATCACTGTTAC 4836 B0825K21-A1.x BACendCGTGAGCCCATTTCAACCACAC 4808 TCCCTGTCTTTGAAGTGTGGTT 4837 B0845N16-A1.xBACend ACATATGAAAAGACCGTAGAAA 4809 CAATTCACAGGCACTTTCTACG 4838B0894N08-A1.x BACend ACGTGGAGAAGGCCGCTGTCTT 4810 CTGGACATTGAATAAAGACAGC4839 B0956I11-A1.x BACend TGAATTTTAACAGGTGGCAAAG 4811ATTCCATCTGACAGCTTTGCCA 4840 B0974M10-A1.y BACend CTCATAGTTGTTACACACTCTG4812 AAGCACGTGTTGAACAGAGTGT 4841 B0646E20-A1.y BACendCTCCATAGGAAGCAGCCATCAG 4813 ACTGGACCCAGCAACTGATGGC 4842 B0723P10-A1.yBACend TGTACCAAACTGTTGACTATTA 4814 GTTTGCCTCATGCTTAATAGTC 4843B0748H09-A1.y BACend GCCTGCACAGGACACAATTGCA 4815 TTCCGGGTTTGATGTGCAATTG4844 B0825K21-A1.y BACend CAATAATTAGTTCCAATGGCGC 4816CACAGTCAGAGTTGGCGCCATT 4845 B0845N16-A1.y BACend GAGTGCTCACCGGAAGAGAAGA4817 TCCAGAGCCAACTGTCTTCTCT 4846 B0894N08-A1.y BACendTGCCTTTCTTCCTTAGAGCTCC 4818 CATCTGGATTAGCTGGAGCTCT 4847 B0956I11-A1.yBACend TGTGGGATGCTTCCAGTTTTGT 4819 GATGAGTAGATCCCACAAAACT 4848B0961F22-A1.x BACend CATCCTGCCTCGGGTCTGAACT 4820 
GGTCACTGCAGGAAAGTTCAGA4849 B0588P16-A1.x BACend AAGAAGGACCTCAACCAAGAGC 4821ACCCATGTGTGTCAGCTCTTGG 4850 B1000B21-A1.x BACend TATTACAGAGGCTGGTGATCAG4822 TAGCCTGTCAGAAGCTGATCAC 4851 B0839D11-A1.x BACendGACAACTTGCTTCCTTTACCTG 4823 AGATGACCTATTGCCAGGTAAA 4852 B1052D15-A1.xBACend CAGAAGCATAGAAACAATCCAG 4824 GCACTGTTTTATAACTGGATTG 4853B1093F08-A1.x BACend TGCTGCAACTGCCAAAGAATTC 4825 CCCTGGCGTTGCAGGAATTCTT4854 B1134M23-A1.x BACend GAATGGGGAGAAAGGGCAAAGG 4855GCTCGTTAAGAGTTCCTTTGCC 4884 B0894M06-A1.x BACend TCTTTCATCTCCTAATGGGCAC4856 TGGGTACATGCACTGTGCCCAT 4885 B0895J20-A1.x BACendACAGACACCTTGGGTCATGACT 4857 GGAACTGGATGTAAAGTCATGA 4886 B0961F22-A1.yBACend CAGTGGTCCCTCTCTCATGAGT 4858 CTGCTTCTAGAACAACTCATGA 4887B0668P23-A1.y BACend ACATGATGCACCCCTTACCGTT 4859 CCGTCTGTGTCCAGAACGGTAA4888 B0588P16-A1.y BACend ACATGGGCTCACAGGAAGATCT 4860CACGACTTAGGAGGAGATCTTC 4889 B1000B21-A1.y BACend AAGAGAAGTCGGAGACTGTGTC4861 TAGCAAGTCTTATCGACACAGT 4890 B0839D11-A1.y BACendCCACTCAACCCACAATCTAGTC 4862 GAATACAGGGATGGGACTAGAT 4891 B1052D15-A1.yBACend CCACCAAATGGATCTGTTGACT 4863 ATCAGAGGTCTGTAAGTCAACA 4892B1093F08-A1.y BACend AGGCCGGTTTCTTACTACAGAA 4864 TCGAAACAGCTGCCTTCTGTAG4893 B1134M23-A1.y BACend ACAGAAAGGCCGTGGGTAGAGA 4865TTCCTCCATTCACGTCTCTACC 4894 B0894M06-A1.y BACend CACATCGCTGCTTGACAGAACT4866 GGGTCATGTGACTGAGTTCTGT 4895 B0895J20-A1.y BACendCACATTTCTGAGACACTTGCTA 4867 TAATACCTGGCATGTAGCAAGT 4896 B0604N13-A1.xBACend ATGAGTCTCTCCACCGAATGTG 4868 GAACCTCAGTCCTGCACATTCG 4897B0714L01-A1.x BACend TCATCAGTTCTAGGAGCTTTCA 4869 GTAAGTACTCCTCCTGAAAGCT4898 B0754A14-A1.x BACend GGATCGCACAGTCACTCTTCAT 4870TGCAAGGCGATATGATGAAGAG 4899 B0894M06-A1.x BACend GATTAGTGTATGGTAGAGGACA4871 TGGTGCAGGATTGTTGTCCTCT 4900 B1128L12-A1.x BACendTTGGTGTGAATCAAGCATCAGG 4872 TGAGCACAGGAGTTCCTGATGC 4901 B0643F18-A1.yBACend GTGGATTAAACCGAGGTGGAAT 4873 CCTTTCCAGTTTGAATTCCACC 4902B0714L01-A1.y BACend GGCATTCTTGCTGCTGCTTCTG 4874 GAATACTGCAGAAGCAGAAGCA4903 B0754A14-A1.y BACend ATCCTGGGCAAGGGAGTTTCAG 4875CTGAGCCACACCTTCTGAAACT 4904 
B0894M06-A1.y BACend TTGTTCACATCGCTGCTTGACA4876 ATGTGACTGAGTTCTGTCAAGC 4905 B1128L12-A1.y BACendGCTTGAACTGCACTCAGCAGGA 4877 GTGCTTCTAACTTCTCCTGCTG 4906 B0687F10-A1.xBACend TCTCTCAAGCCACTTTCTATGT 4878 ACGTGAATCACGGAACATAGAA 4907B0791C09-A1.x BACend ACTGTGGCTGCACATAGGGATA 4879 AAAGCTTCCTGGGGTATCCCTA4908 B0820N16-A1.x BACend GGACCCACCCTGTCAATTTCAT 4880GGGGCGATGGGAATATGAAATT 4909 B0880M22-A1.x BACend TGTTTGGATATGGTGGCTACTA4881 TGTGTGTTTTGAGTTAGTAGCC 4910 B1008L21-A1.x BACendATCTCTGGGAAGCTCTACAGTG 4882 CTCAAATCCCCTCCCACTGTAG 4911 B1043N20-A1.xBACend AGATAATGGGTTGCTTGGGCTC 4883 GTTAAAGCAGTTATGAGCCCAA 4912B0700H07-A2.x BACend CTTGGACTCAAGACATCCTCTG 4913 TGGGAGACTGAGACCAGAGGAT4942 B0687F10-A1.y BACend TTTCAGTGACTGCTCTTCCGTT 4914TGGCTGTAAGTGAAAACGGAAG 4943 B0791C09-A1.y BACend CATTAGAAGCCCAGGAGGAAAC4915 CTCCTTCTTCCCGAGTTTCCTC 4944 B0880M22-A1.y BACendCTATGTTGCATAGGAGTAGTGA 4916 AAGGATACCCTCTCTCACTACT 4945 B0909E24-A1.yBACend CCCTCTATAACATTTTCTCCCA 4917 CTTAGGACAACCCCTGGGAGAA 4946B1008L21-A1.y BACend GAGCCCTGCTCAGAATTTCATG 4918 GAGGCAAGGTCTTTCATGAAAT4947 B0923H14-A1.y BACend GCAGCCTTACTGAGCTGACAGT 4919CCGTCCATGGGAACACTGTCAG 4948 B0979G13-A1.y BACend CTCCACCTGGATGGGTCAACTT4920 ATTAAGTTCCTTGAAAGTTGAC 4949 B1020H18-A1.y BACendCATGATCTCAATAATTGCAACT 4921 GAAGAAAACAGGAGAGTTGCAA 4950 B0756E08-A1.yBACend ATGGGTATCACTATGCATAGCA 4922 TTTAAAATTCCACTTGCTATGC 4951B0666F01-A1.y BACend GTGTCCTGGTGAACGGCTCTGA 4923 AATCAGAGTTTCCTTCAGAGCC4952 B0883G19-A1.y BACend ACATTCCCAGCTCTACATTCTA 4924CTGAGTTTCCTCACTAGAATGT 4953 B0923H14-A1.x BACend GATTAAGAGAGGGTAGGAGGGT4925 ACCTTCCAACCATCACCCTCCT 4954 B0781I18-A1.x BACendGGATTAATAGTACCACCCCCTG 4926 ATTTAACACAAAGGCAGGGGGT 4955 B0979G13-A1.xBACend GACATTCCATGCAAATGGACAC 4927 CCCGCTTGCTTTTGGTGTCCAT 4956B1020H18-A1.x BACend CATATGGCTAAGGCTCTATCTA 4928 AATCAGCAGGTACATAGATAGA4957 B1029H23-A1.x BACend CAGCTAGGGGAAGAGTGACAGG 4929CGAAATGCCGACTGCCTGTCAC 4958 B1076C21-A1.x BACend CTAGAATTTCCATGTAGTAAGA4930 ATACTTGCTCTTTCTCTTACTA 4959 B1104N09-A1.x 
BACendCCTGCCTGATGAGCAAAGAATA 4931 CACTGGGTACTTCTTATTCTTT 4960 B0663J16-A1.xBACend CAACCAACTATCTGCTGCCTTC 4932 TAGGTGAGTCTCTTGAAGGCAG 4961B0656F13-A1.x BACend GGTGTGGAGAGAGTGGACTCTA 4933 TAATATAAAATCCTTAGAGTCC4962 B0883G19-A1.x BACend CATGGCACAGGTGATAGAGTGA 4934ATAATCCAGGAAGATCACTCTA 4963 B0760A04-A2.x BACend GCTCTCATGATTTGGGCATGCT4935 GTTCAAATCTTGCAAGCATGCC 4964 B0785D22-A1.x BACendGTGAACAGGCTAACACTGTTAA 4936 ATGCGTGCTGGTGTTTAACAGT 4965 B0723P10-A1.yBACend TGGAAGCCACTTAGAGGTTGCA 4937 AACAGTTTGGTACATGCAACCT 4966B1095L07-A1.x BACend TCTAAAGATGGGGCCTCACAGT 4938 ATGGCTTCAGTTTTACTGTGAG4967 B0997I04-A1.x BACend TACTTTACTCTGTTTCCTGTAT 4939AAGTGATATGAGACATACAGGA 4968 B0723P10-A1.x BACend AGGAAAGGGAAATAGAAGGGAA4940 TATCTGCGTGGTGGTTCCCTTC 4969 B0997I04-A1.y BACendAGTGTTAGTGGGAATGAGGAGT 4941 CTCCATTATCAGTCACTCCTCA 4970 B0880L16-A2.xBACend GAAACCCACATCAGCACAAAGG 4971 TTTGTGCTGGCTGGCCTTTGTG 5000B0598O21-A2.x BACend CGCCGAATTCCATGACTCTTGA 4972 TTTGGCAGAATGTTTCAAGAGT5001 B0768I12-A2.x BACend CACAAAGACAGACCCACAGCTC 4973GCTGTGGGAAATGTGAGCTGTG 5002 B1056C02-A2.x BACend CCACACAGGAAAACTGCCATCT4974 CCAATTCTCCTTTCAGATGGCA 5003 B1056C02-A2.y BACendGAGACGTGAGTCAGGACAGGTG 4975 TGCCCAATCTGTACCACCTGTC 5004 sts-AA017225 ESTGATGCCAGGAAGTACCTGGTAA 4976 GCAATCTCCAATCCTTACCAGG 5005 A004F14 ESTGGAAACCCGTGACTTGACTTAG 4977 TGTCATCAGCACCCCTAAGTCA 5006 SGC31333 ESTAGGTGGTGATCTAGTCTCCGGT 4978 GAGTGAAAGGTGGAACCGGAGA 5007 WI-12422 ESTAACCAGACAGCATCTCTGGAGAGA 4979 CACAGAGAGTGCATTTTCTCTCCA 5008 stSG21539EST ATGCATACAGCAGGCCATTGTG 4980 CAGCCCCCTATGACCACAATGG 5009 WI-13120 ESTGGGAGCTACAGGTGATAGCTAT 4981 GGGCGCATAGCTATCACCTGTA 5010 stSG22703 ESTCACCAGAGACCAGAGACTCGAA 4982 ACCATGGACAGGCCTTCGAGTC 5011 stSG36097 ESTTGAGCAGTCTGACCTGCTTCTC 4983 AGCTGGAGCACCTGGAGAAGCA 5012 stSG9807 ESTCAGCCAGCTACTGAACCTTATG 4984 TGGCCCTAGGCACACATAAGGT 5013 stSG15434 ESTTACCACCACCCTGCGCAGATGG 4985 GTANTCTGTGGCCGCCATCTGC 5014 stSG30525 ESTGGCACACAGTCTGCAATGCTTG 4986 TAGGGGACATCCCTCAAGCATT 5015 A007A34 ESTTGTTCTGGCAGATTCCATCATC 4987 
CTTATGTTGGGATTGATGATGG 5016 A006D44 ESTCAGGGTCATTCGAGGAGGAACA 4988 CGAAAGCTTGAATCTGTTCCTC 5017 SGC30248 ESTGATGCAAGCAGCACAGAGCAGT 4989 CTCCTTCCCACAGCACTGCTCT 5018 sts-N20163 ESTTCTCTACCAGGCAATACTTCAC 4990 CTGAAATCGAGTGAGTGAAGTA 5019 Cda0af01 ESTAAAGGCCACACAGCCCACAATC 4991 GGCCTGCAGTGGATGATTGTGG 5020 Cda0ca07 ESTAAGTCTGACTTCAAATCGGTAC 4992 TGTCTAAGCCTCATGTACCGAT 5021 stSG3292 ESTAAGTCTGACTTCAAATCGGTAC 4993 TGTCTAAGCCTCATGTACCGAT 5022 SGC34088 ESTAAGTCAATTGCTCCCCATCTGC 4994 CTTGTTCGTTGCTGGCAGATGG 5023 WI-12272 ESTGACTCATATGACAGACCTTGAA 4995 TGTCCCACCTTTCCTTCAAGGT 5024 stSG16387 ESTCATGACTCCCAGACCCCTTAGA 4996 TGCCCAAATTCCTGTCTAAGGG 5025 SGC31722 ESTCAAACGGAGAAGCCCCAGATAC 4997 TTGTTACTGTACGTGTATCTGG 5026 WI-15018 ESTAGTGACAATTAGAGCTCTGGGG 4998 GCTCCTTCATTCTCCCCCAGAG 5027 WI-18492 ESTTGCTTGGCCAAACAGACTTCCT 4999 TGATGAGACTGCAGAGGAAGTC 5028 stSG9546 ESTACCTGAGAGCAGGGAGATTCCA 5029 TAACTCCTAGCAGCTGGAATCT 5058 A006O16 ESTCCCGAGGCTTCTCTGAACACTA 5030 CTCACAGCGCTTTCTAGTGTTC 5059 H64839 ESTAATCTGAGGCACACAGGAGAGT 5031 ACTGAGCTCCTTTCACTCTCCT 5060 stSG3357 ESTGCCTTGCTAACTGTACCATAGT 5032 CACCTGCAGGAATAACTATGGT 5061 stSG30906 ESTTCTAAGGTTCCGGATGGACGTG 5033 TGTCCCGCCAAATTCACGTCCA 5062 stSG26056 ESTGAGTTACAGGAAGTGGTTCCCC 5034 CTGCGTGTCTGTCAGGGGAACC 5063 SGC30786 ESTACAGCTCTCCTTCCTTAATGCC 5035 CACCCTTATCTCTGGGCATTAA 5064 sts-N59820 ESTAGACTGCATCCTTCGAACAACAGG 5036 ACTGGGAAATCTAGCGCCTGTTGT 5065 stSG42115EST TTCTCGAGGGTTCTCTGCTTCACT 5037 AGTTCTCTCGGGAGTTAGTGAAGC 5066 FB9F8EST GAAAAACCCGCACCCTGACACAAC 5038 CGTCCAGAAAACGTAGGTTGTGTC 5067 AA252357EST CAGCACATCGAGTCCTCAAATCCG 5039 CCAGACTTTCCTCACTCGGATTTG 5068 stSG4720EST TCGAGAAAGGCTGTTCCTACAAGG 5040 TAACCTCAGGACCTTCCCTTGTAG 5069sts-AA001424 EST AAGCTGCTCTTCTCAGCTACTCTG 5041 TTTCAGGGTTCTGGGTCAGAGTAG5070 stSG31443 EST CAAAGCACTGGACTGAGAGAATTC 5042GGTGGATACAGTGTGTGAATTCTC 5071 W1-6385 D12S1405 ESTTAAAGGCAAAGGCCACACAGCCCA 5043 CTGCAGTGGATGATTGTGGGCTGT 5072 A008Y05 ESTTAAAGATAAGGCGTGGGCTTTGAC 5044 AACTCTGGCAGACACTGTCAAAGC 5073 R50113 
ESTTCATACCAAGTGCTGGCTGCTAAG 5045 CCAGTTTCTCCACATCCTTAGCAG 5074 sts-H94865EST CTCTAAGAACCAGACCCTCAGTTG 5046 CTCATTCCCTTACTGGCAACTGAG 5075 A006R19EST GGTTTGAACAGTGGGAGATACCAG 5047 TTTTCTCCTCCCACCTCTGGTATC 5076 SGC34278EST CAAACACAAGAGGTCCTCTTGCTG 5048 ACAGTCCATGGAAAGGCAGCAAGA 5077 A004B47EST GTGCCCTGTGAAATTGGCCTTTCT 5049 GCTGGAAGCAGAAAGAAGAAAGGC 5078stSG40199 EST GGAAGGCTGTCTTCTTTCTACCAC 5050 TGACACCTGCCTCATGGTGGTAGA5079 stSG8935 EST CAAACACAAGAGGTCCTCTTGCTG 5051 ACAGTCCATGGAAAGGCAGCAAGA5080 stSG4731 EST GCATGTGTTGTTTCTGTCTGGGAT 5052 AGCAGACAAGATCTAGATCCCAGA5081 stSG8142 EST GTGCCCTGTGAAATTGGCCTTTCT 5053 GCTGGAAGCAGAAAGAAGAAAGGC5082 A005X42 EST GCATGTGTTGTTTCTGTCTGGGAT 5054 AGCAGACAAGATCTAGATCCCAGA5083 CDA18G06 D12S1205E EST ACAGACTACAACGTCAATGAAGCC 5055TCCGACAATGCCAGGAGGCTTCAT 5084 STSG40222 EST TCTTCTCTCTCACTGCAGACCATG5056 TGCCCACATGGAGAAACATGGTCT 5085 sts-R55615 ESTGCTAGTGGAACGGATACCTGAAAG 5057 CTTCCTGTGGTAGTGTCTTTCAGG 5086 sts-R02295EST CTCAATCCACATGACAACGCTTTG 5087 ACCTAGTATCCTACCTCAAAGCGT 5114sts-R81342 EST GGCAAAAGGGAAAAACCATGTATG 5088 TCACTTCCCTTACAGTCATACATG5115 sts-H65839 EST AATAGATTGATTGCCGTCCTCAAC 5089AAGTATGTGCTAACTTGTTGAGGA 5116 stSG52716 EST AGATGGGGGAGACAAACGGTAAAC5090 CGGAAAGGAAACATCTGTTTACCG 5117 stSG54813 EST highly  TTTGTTGGTCAGCTGGTCCAACCA 5091 TGCAGTAATGGATGGGTGGTTGGA 5118 similarto 22 kd peroxisomal  membrane protein stSG50504 ESTCCGTATTACCCAGACTACACACTG 5092 CACCAATGGCATAGCACAGTGTGT 5119 stSG48386EST CCAGCAGCAGGATATTGTGTACGT 5093 GTTTACAGCCTACAGGACGTACAC 5120stSG54842 EST TTCTTCTTCAGGTCCCGCTCAAAG 5094 TCACGGCCTACGAGATCTTTGAGC5121 stSG53600 EST Highly   AACTGGGATGCCAACTAACACGTG 5095AAGTCTTGGGGAACTCCACGTGTT 5122 similar to peptide transporter  PTR2stSG53541 EST Homo   AACCCCACCTATGGTTGTAGTGAG 5096GGCGTAAAGTAGGATGCTCACTAC 5123 sapiens hiwi mRNA, partial cds stSG53307EST GAGGCTAGGCTGAATATAACCAGG 5097 CACTGCCAGTCAGCAACCTGGTTA 5124stSG63473 EST CCACTGGCTGCATTTTCCAGCTTT 5098 CACCAGGTACTAGAGAAAAGCTGG5125 stSG54325 EST CGGCACAAGCAGATTTCAGATCAG 
5099CTGGGGGAAATGCTGACTGATCTG 5126 stSG52343 EST AACTGGAGTCAGGTGATCACGAAG5100 CCAGTGAAATAAGCCCCTTCGTGA 5127 WIAF-856 EST AAGTCAATTGCTCCCCATCTGCCA5101 TCTACTTGTTCGTTGCTGGCAGAT 5128 stSG47723 ESTCTGAGTTCCTTAGCAGCTTCCGTA 5102 TCTTCAAAGGACCTCCTACGGAAG 5129 stSG60065EST GGAGGTGAATAAGCTGATCCTGCA 5103 GCTGGGTAACTAGAAGTGCAGGAT 5130stSG46424 EST GGACACATCTGTTCCATCTTCACC 5104 CCCATGAGTTGTTAGTGGTGAAGA5131 sts-U79526 Gene DEZ TGATCCTCACTGTGGAACCCCT 5105GAGAGAGTCCATTGAGGGGTTC 5132 SGC31491 Gene NOS1 AGAGCGGCTCTTTTAATGAGGG5106 GGGAGACGTCGCAACCCTCATT 5133 stSG1936 Gene CLA-1TCAGTCCATAGGATGATGTCAG 5107 TCCTCCAGCCTAAACTGACATC 5134 sts-W31616 GeneUBA52 CCCAGCAAAGATCAACCTCTGC 5108 ATCCCTCCTGATCAGCAGAGGT 5135 ZNF10 GeneKOX 1 ATGTGGGAAGGCCTTTGGTAGT 5109 GTAAGGTTTGAGCCACTACCAA 5136 ZNF26 GeneKOX20 GTGAATGTGGAAAAGCCTTCAC 5110 GAGATGACTTCTGAGTGAAGGC 5137 WI-6921Gene RNP24 GTTGCAAGTGTTCTCACCCAAG 5111 AACCATACTTCCACCTTGGGTG 5138sts-D60472 Gene SMRT GAACGACGTGTGTAAATGACAG 5112 AGGGTGGTGGTATTCTGTCATT5139 WI-16177 Gene RAN CCTTCAGGCATCCCACAGATGA 5113CGGAACATGTGCCTTCATCTGT 5140 stSG1702 Gene CAGH32TCAGGCACCAAATCTGAACAAGGG 5141 GAAGGTTGGATCCAAGCCCTTGTT 5170 IB2452 GeneULK1 GCCATCAAGGTGATGAGGAAGAAG 5142 AAGAAAATCCCCGTGACTTCTTCC 5171stSG39493 Gene CAGH32 GTGCTGAATCTCTTGCGTGACATG 5143TAGTGAACCTTGGGACCATGTCAC 5172 A002A44 Gene CAGH32TGGTTCTCTGCTTCACTGGCAGAA 5144 GGATAAGCTTGTGTGGTTCTGCCA 5173 stSG27206Gene GCP170 GAGCACATCTGGCCTGGCCAGT 5145 TGAGGTTCTGAGTCACTGGCCA 5174CDA1JF08 Gene GCP170 AGTGAGCTCAGAACACCTCACACC 5146AGTTGAGTGACGCTGTGGTGTGAG 5175 R39599 Gene GCP170ACTTCTGCAGTCATCGAGAAGTCC 5147 CCCACAAAAGATCCCAGGACTTCT 5176 stSG31494Gene ZNF140 TCTCCAGTATGAGTCCTCTGGTGT 5148 GCTTTTCCCTGGTGTTACACCAGA 5177TH_a Gene MUC8 ATCCACCGCTAGAAACCCACTC 5149 GACCATCAACTGATGAGTGGGT 5178SGC31491_a Gene NOS1 CCTAGTAGCTTTCCTCCCAAAG 5150 ATTGGAAAGAAAGCCTTTGGGA5179 sts-X89576 Gene MMP17 AGAGGAGCTGTCTAAGGCCATC 5151TGCTGCATGGCTGTGATGGCCT 5180 stSG43910 Gene SFRS8 cagtacatgtttacccacagac5152 tgcacataagtcgacagacacc 5181 
P699K7/T7 D12S2479 GenomicAGAAAGCCTCTCTTCCCCTCTCTC 5153 GTCACATTTTTGGGGTGAGAGAGG 5182 P493P14/T7D12S2451 Genomic TCTCAGGAACCAGAGTCCATAG 5154 CAGTTAGATAAAAGCTATGGAC 5183P313C9/SP6 D12S2447 Genomic CAGCTCAGGAAGTTCACCAGGC 5155AGGACCCAGTTGAAGCCTGGTG 5184 WI-5824 D12S2002 GenomicCATTTACCTGCCCGCCTGGTCA 5156 CAGGATTTGTGTGGTGACCAGG 5185 WI-10803D12S1944 Genomic CTGGATTTCCAGAGACTGACCT 5157 TCAGGCAATAGAGAAGGTCAGT 5186WI-2002 D12S1084 Genomic ACAACAGAAGTTGTCAGTGAAG 5158CTGTTCAACAGTGCCTTCACTG 5187 WI-3045 d12S1420 GenomicCTTAAGCGAGCAACCTGATAACCC 5159 TCCTAATCTGGCAGGTGGGTTATC 5188 WI-3549D12S1998 Genomic GAGAATCAGCTGCCATGTTGTGAG 5160 GGACTCTTTGAGCATCCTCACAAC5189 WI-6077 D12S1322 Genomic AGCAGCACTAGGCATGGCTGTT 5161ATAAGAGCTGAGATAACAGCCA 5190 SHGC-12243 D12S1845 GenomicCAAGCTTCCCTCCTTTCCCATTGT 5162 TTCCGGCGTTGTAGTTACAATGGG 5191 SHGC-13782D12S1851 Genomic AGTCAGGTACAGGGTTCTGACAAC 5163 CACCTTGTTCGTCTCTGTTGTCAG5192 SHGC-14238_a D12S1853 Genomic CAAGTGTCCCACTTTTCCTGCA 5164CCGCTCACTCACTCTGCAGGAA 5193 WI-3549_a D12S1998 GenomicCCATGTTGTGAGGATGCTCAAA 5165 ACCTTTTAGGACTCTTTGAGCA 5194 AFMb337xc1D12S1675 MSAT GATCTGCAGCATTGAGGGAGCA 5166 GTCTCTAGGCACATTGCTCCCT 5195AFMa197zd9 D12S1609 MSAT GGGGATTTAGTAGNTCAATGTA 5167GTCATCGGGTGACATACATTGA 5196 AFMb350zb5 D12S1679 MSATGTTTGTAGGCTTCTTGCCTCTG 5168 CCCTCTACCATTCACAGAGGCA 5197 UT7009 D12S834MSAT GTCCAAGAGTGGGCAGTTGACC 5169 ATTGGATAGGCATAGGTCAACT 5198 AFMb301we5D12S1659 MSAT TCTAACTTTCGTTTGCCTGCTT 5199 CACTGTGCTTTCAGAAGCAGGC 5214AFMa064xg9 D12S1714 MSAT GTTCGAGATCCACAGGTGTCTA 5200TGTAGCATATGATGTAGACACC 5215 CHLC.ATA19A06 D12S2069 MSATTGTTGCCTAGGCTGGTCTTGAA 5201 CTTGAGTCCAAGAGTTCAAGAC 5216 ATA29A06D12S1045 MSAT GACCAGCCTAGGCACATAGTGA 5202 TTAGAGATGGGGTCTCACTATG 5217AFM210zd6 D12S97 MSAT AATTGTCTCCATGGGGCTCGAA 5203 CCTTCACTGAGGAGTTCGAGCC5218 AFM295ye9 D12S343 MSAT TACTGCCACTCTCCAGAATATC 5204GATCTGGAAGGTCGGATATTCT 5219 509/510 D12S63 MSAT GTGGTTGGGTTAACAAAGAATG5205 GAGAAGCTGCAACGCATTCTTT 5220 AFMa275xb9 D12S1628 MSATAAGGTAGAGCTTGGCAACAGGA 5206 
AGCCCCGCTGGACCTCCTGTTG 5221 AFMb002vd5D12S1638 MSAT TGCCAGGAGTTTTAAGTTGGTT 5207 GAATGGCATTTGGTAACCAACT 5222GATA13D05 D12S392 MSAT GTATGGATAGCAGACGATAGAG 5208TCTATCTGTCATCCCTCTATCG 5223 12QTEL82 D12S2342 MSATTACATTCCACCAGCAGTGCACAAG 5209 TGGAGAAATTGGAAGCCTTGTGCA 5224 12QTEL87D12S2343 MSAT TTGTTAGGCTTCTGGGTTGGGTAC 5210 ACAGGCATTAGCCCCTGTACCCAA5225 AFMa082ze9_a D12S1723 MSAT CTTCCGTCATGAATGTCAGTAG 5211TCTGCAGTGGTTCCCTACTGAC 5226 AFM156xc5_a D12S1599 MSATTGGGAAGAGTTGCCTCCAGGAA 5212 CCCTTCTCAGTCCTTTCCTGGA 5227 AFMa123xe1D12S367 MSAT CTGTATTAAATGAGTCTGGGTT 5213 GGGTTAATACAGTTAACCCAGA 5228

3. Radiation Hybrid (RH) Mapping:

Radiation hybrid mapping was performed against the Genebridge4 panel (Gyapay et al., 1996, Hum. Mol. Genet. 5:339-46) purchased from Research Genetics. Mapping was performed in order to refine the chromosomal localization of genetic markers used in genotyping as well as to identify, confirm, and refine localizations of markers from proprietary sequences. Standard PCR procedures were used for typing the RH panel with markers of interest.

Briefly, 10 μl PCR reactions contained 25 ng DNA of each of the 93 Genebridge4 RH samples. PCR products were electrophoresed on 2% agarose gels (Sigma) containing 0.5 μg/ml ethidium bromide in 1×TBE at 150 volts for 45 min. Model A3-1 electrophoresis systems were used (Owl Scientific Products, Portsmouth, N.H.). Typically, gels contained 10 tiers of lanes with 50 wells/tier. Molecular weight markers (100 bp ladder, GibcoBRL, Rockville, Md.) were loaded at both ends of the gel.

Images of the gels were captured with a Kodak DC40 CCD camera and processed with Kodak 1D software (Eastman Kodak Comp.; Rochester, N.Y.). The gel data were exported as tab delimited text files. The names of the files included information about the panel screened, the gel image files, and the marker screened. These data were automatically imported using a customized Perl script into Filemaker databases for data storage and analysis. The data were then automatically formatted and submitted to an internal server for linkage analysis to create a radiation hybrid map using RHMAPPER (L. Stein et al., 1995; available from Whitehead Institute/MIT Center for Genome Research).
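The import step described above may be pictured with a minimal Python sketch. The original work used a customized Perl script that is not reproduced in this document; the two-column tab-delimited layout, the score codes, and the function name below are assumptions made purely for illustration and do not represent the actual export format or the RHMAPPER input specification.

```python
import csv
import io

PANEL_SIZE = 93  # number of Genebridge4 RH samples typed per marker


def read_rh_vector(handle, panel_size=PANEL_SIZE):
    """Build one score string for a marker from tab-delimited gel calls.

    Assumed (hypothetical) layout: one row per hybrid, 'index<TAB>call',
    where call is '+' (product present), '-' (absent) or '?' (ambiguous).
    """
    codes = {"+": "1", "-": "0", "?": "2"}
    vector = ["2"] * panel_size  # hybrids with no call default to ambiguous
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        index, call = int(row[0]), row[1].strip()
        if index not in range(1, panel_size + 1):
            raise ValueError(f"hybrid index {index} is outside the panel")
        vector[index - 1] = codes[call]
    return "".join(vector)


# Hypothetical export covering four of the 93 hybrids.
example = io.StringIO("1\t+\n2\t-\n3\t?\n93\t+\n")
vector = read_rh_vector(example)
print(len(vector), vector[:3], vector[-1])
```

Defaulting untyped hybrids to the ambiguous code is a design choice of this sketch, so that a missing lane is never silently scored as a negative.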

4. BAC Library Screening:

The protocol used for BAC library screening was based on the “overgo” method, originally developed by John McPherson at Washington University in St. Louis (http://www.tree.caltech.edu/protocols/overgo.html, and W-W. Cai et al., 1998, Genomics 54:387-397). This method involved filling in the overhangs generated after annealing two primers. Each primer was 22 nucleotides in length, and the two primers overlapped by 8 nucleotides. The resulting labeled product (36 bp) was then used in hybridization-based screening of high density grids derived from the RPCI-11 BAC library (deJong, supra). Typically, 15 probes were pooled together to hybridize 12 filters (13.5 genome equivalents).
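The overgo geometry (two 22-mers sharing an 8-nucleotide overlap, filled in to a 36 bp product) can be checked with a short Python sketch. The function names are hypothetical; the primer pair is the first entry of Table 2 (marker B0610N03-A1.x, SEQ ID NOS: 4689 and 4714).

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overgo_overlap(forward, reverse):
    """Length of the longest 3' end of `forward` that pairs with `reverse`."""
    fwd = forward.upper()
    rc = reverse_complement(reverse)
    for k in range(min(len(fwd), len(rc)), 0, -1):
        if fwd[-k:] == rc[:k]:
            return k
    return 0


def filled_in_product(forward, reverse):
    """Top strand of the duplex obtained after filling in both overhangs."""
    k = overgo_overlap(forward, reverse)
    if k == 0:
        raise ValueError("primers do not anneal at their 3' ends")
    return forward.upper() + reverse_complement(reverse)[k:]


forward = "CAAGCGATAGTTCTAATTTTCT"   # Table 2, B0610N03-A1.x, forward primer
reverse = "TATGTGTTGGAGCCAGAAAATT"   # Table 2, B0610N03-A1.x, reverse primer
overlap = overgo_overlap(forward, reverse)
product = filled_in_product(forward, reverse)
print(overlap, len(product))
```

For this pair the sketch reports an 8-nucleotide overlap and a 36-nucleotide product, consistent with 22 + 22 − 8 = 36.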

Stock solutions (2 μM) of combined complementary oligos were heated at 80° C. for 5 min, placed at 37° C. for 10 min, and then stored on ice. Labeling reactions included the following: 1.0 μl H₂O; 5 μl mixed oligos (2 μM each); 0.5 μl BSA (2 mg/ml); 2 μl OLB (-A, -C, -N6) Solution (see below); 0.5 μl ³²P-dATP (3000 Ci/mmol); 0.5 μl ³²P-dCTP (3000 Ci/mmol); and 0.5 μl Klenow fragment (5 U/μl). The reaction was incubated at RT for 1 hr, and unincorporated nucleotides were removed using Sephadex G50 spin columns. The OLB (-A, -C, -N6) Solution was prepared as follows. Solution O: 1.25 M Tris-HCl, pH 8, and 125 mM MgCl₂; Solution A: 1 ml Solution O, 18 μl 2-mercaptoethanol, 5 μl 0.1 M dTTP, and 5 μl 0.1 M dGTP; Solution B: 2 M HEPES-NaOH, pH 6.6; Solution C: 3 mM Tris-HCl, pH 7.4, and 0.2 mM EDTA. Solutions A, B, and C were combined to a final ratio of 1:2.5:1.5, and aliquots were stored at −20° C.
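As a check on the recipe above, the following Python sketch totals the labeling reaction and splits an OLB batch according to the stated 1:2.5:1.5 ratio. The component volumes are those listed in the text; the 100 μl batch size and the function name are hypothetical and chosen only to make the arithmetic concrete.

```python
# Component volumes of one labeling reaction, in microliters (from the text).
labeling_mix_ul = {
    "H2O": 1.0,
    "mixed oligos (2 uM each)": 5.0,
    "BSA (2 mg/ml)": 0.5,
    "OLB (-A, -C, -N6)": 2.0,
    "32P-dATP": 0.5,
    "32P-dCTP": 0.5,
    "Klenow fragment (5 U/ul)": 0.5,
}
total_ul = sum(labeling_mix_ul.values())

# Final oligo concentration and enzyme units in the assembled reaction.
oligo_final_um = 2.0 * labeling_mix_ul["mixed oligos (2 uM each)"] / total_ul
klenow_units = 5.0 * labeling_mix_ul["Klenow fragment (5 U/ul)"]


def olb_volumes(batch_ul, ratio=(1.0, 2.5, 1.5)):
    """Split a batch of OLB into Solutions A, B and C by the A:B:C ratio."""
    parts = sum(ratio)
    return tuple(batch_ul * r / parts for r in ratio)


a_ul, b_ul, c_ul = olb_volumes(100.0)  # hypothetical 100 ul batch
print(total_ul, oligo_final_um, klenow_units, (a_ul, b_ul, c_ul))
```

Under these figures the reaction totals 10 μl, each oligo is present at 1 μM, and 2.5 U of Klenow fragment is used per reaction.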

High-density BAC library membranes were pre-wetted in 2×SSC at 58° C. Filters were then drained slightly and placed in hybridization solution (1% BSA; 1 mM EDTA, pH 8.0; 7% SDS; and 0.5 M sodium phosphate), pre-warmed to 58° C., and incubated at 58° C. for 2-4 hr. Typically, 6 filters were hybridized in each container. Ten milliliters of pre-hybridization solution was removed, combined with the denatured overgo probes, and added back to the filters. Hybridization was performed overnight at 58° C. The hybridization solution was removed and filters were washed once in 2×SSC, 0.1% SDS, followed by a 30 min wash in the same solution at 58° C. Filters were then washed in: 1) 1.5×SSC and 0.1% SDS at 58° C. for 30 min; 2) 0.5×SSC and 0.1% SDS at 58° C. for 30 min; and 3) 0.1×SSC and 0.1% SDS at 58° C. for 30 min. Filters were then wrapped in Saran Wrap®, and exposed to film overnight. To remove bound probe, filters were treated in 0.1×SSC and 0.1% SDS pre-warmed to 95° C., and then cooled to RT. Clone addresses were determined in accordance with instructions supplied by RPCI.

To recover clonal BAC cultures from the library, a sample from the appropriate library well was plated by streaking onto LB agar (T. Maniatis et al., 1982, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.) containing 12.5 μg/ml chloramphenicol (Sigma), and plates were incubated overnight. A single colony and a portion of the initial streak quadrant were inoculated into each well of a 96-well plate containing 400 μl LB plus chloramphenicol. Cultures were grown overnight at 37° C. For storage, 100 μl of 80% glycerol was added to each well, and the plates were placed at −80° C.

To determine the marker content of clones, aliquots of the 96-well plate cultures were transferred to the surface of nylon filters (GeneScreen Plus, NEN) placed on LB/chloramphenicol petri plates. Colonies were grown overnight at 37° C. and colony lysis was performed by placing filters on pools of: 1) 10% SDS for 3 min; 2) 0.5 N NaOH and 1.5 M NaCl for 5 min; and 3) 0.5 M Tris-HCl, pH 7.5, and 1 M NaCl for 5 min. Filters were then air-dried and washed free of debris in 2×SSC for 1 hr. The filters were air-dried for at least 1 hr, and DNA was crosslinked to the membrane using standard conditions. Probe hybridization and filter washing were performed as described above for the primary library screening. Confirmed clones were stored in LB containing 15% glycerol.

In certain cases, polymerase chain reaction (PCR) was used to confirm the marker content of clones. PCR conditions for each primer pair were optimized with respect to MgCl₂ concentration. The standard buffer contained 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 0.2 mM each dNTP, 0.2 μM each primer, 2.7 ng/μl human DNA, 0.25 U AmpliTaq (Perkin Elmer), and MgCl₂ at a concentration of 1.0 mM, 1.5 mM, 2.0 mM, or 2.4 mM. Cycling conditions included an initial denaturation at 94° C. for 2 min; 40 cycles at 94° C. for 15 sec, 55° C. for 25 sec, and 72° C. for 25 sec; and a final extension at 72° C. for 3 min. Depending on the results, the conditions were further optimized as required. For further optimization, the annealing temperature was increased to 58° C. or 60° C., the cycle number was increased to 42, the annealing and extension times were increased to 30 sec, and/or AmpliTaq Gold (Perkin Elmer) was used.
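The cycling program and the optimization variables described above can be laid out in a brief Python sketch. The hold times and MgCl₂ concentrations come from the text; the function name and the idea of crossing every MgCl₂ concentration with every annealing temperature are assumptions made for illustration, since the text does not state that all combinations were tested. Ramp times between temperatures are ignored.

```python
from itertools import product

MGCL2_MM = (1.0, 1.5, 2.0, 2.4)      # MgCl2 concentrations listed in the text
ANNEAL_TEMPS_C = (55, 58, 60)        # standard and raised annealing temperatures


def hold_time_seconds(cycles=40, denature=15, anneal=25, extend=25,
                      initial=120, final=180):
    """Sum of programmed hold times; ramping between steps is not counted."""
    return initial + cycles * (denature + anneal + extend) + final


standard_s = hold_time_seconds()
# Further optimization: 42 cycles with 30 sec annealing and extension.
extended_s = hold_time_seconds(cycles=42, anneal=30, extend=30)

# Hypothetical full grid of conditions for one primer pair.
grid = list(product(MGCL2_MM, ANNEAL_TEMPS_C))

print(standard_s / 60, extended_s / 60, len(grid))
```

Under these assumptions the standard program holds for 2900 sec (about 48 min) and the extended program for 3450 sec (57.5 min).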

5. BAC DNA Preparation:

Several different types of DNA preparation methods were used to isolate BAC DNA. The manual alkaline lysis miniprep protocol listed below (Maniatis et al., 1982) was successfully used for most applications, i.e., restriction mapping, CHEF gel analysis, and FISH mapping, but this protocol was not reproducibly successful for endsequencing. The Autogen protocol described below was used to isolate BAC DNA for endsequencing.

For manual alkaline lysis BAC minipreps, bacteria were grown in 15 ml terrific broth (TB) containing 12.5 μg/ml chloramphenicol. Cultures were placed in a 50 ml conical tube at 37° C. for 20 hr with shaking at 300 rpm. Cultures were centrifuged in a Sorvall RT 6000 D at 3000 rpm (1800×g) at 4° C. for 15 min. The supernatant was aspirated as completely as possible. In some cases, cell pellets were frozen at −20° C. at this step for up to 2 weeks. The pellet was then vortexed to homogenize the cells and minimize clumping. Following this, 250 μl of P1 solution (50 mM glucose, 15 mM Tris-HCl, pH 8, 10 mM EDTA, and 100 μg/ml RNase A) was added. The mixture was pipetted up and down to mix. The mixture was then transferred to a 2 ml Eppendorf tube. Subsequently, 350 μl of P2 solution (0.2 N NaOH, 1% SDS) was added, mixed gently, and the mixture was incubated for 5 min at RT. Then, 350 μl of P3 solution (3 M KOAc, pH 5.5) was added and mixed gently until a white precipitate formed. The solution was incubated on ice for 5 min, and then centrifuged at 4° C. in a microfuge for 10 min.

The supernatant was transferred carefully (avoiding the white precipitate) to a fresh 2 ml Eppendorf tube, and 0.9 ml of isopropanol was added. The solution was mixed and left on ice for 5 min. The samples were centrifuged for 10 min, and the supernatant was carefully removed. Pellets were washed in 70% ethanol and air-dried for 5 min. Pellets were then resuspended in 200 μl of TE8 (10 mM Tris-HCl, pH 8.0, 1.0 mM EDTA, pH 8.0), and RNase (Boehringer Mannheim; Germany) was added to 100 μg/ml. Samples were incubated at 37° C. for 30 min, then precipitated by addition of NH₄OAc to 0.5 M and 2 volumes of ethanol. Samples were then centrifuged for 10 min, and the pellets were washed with 70% ethanol. The pellets were air-dried and dissolved in 50 μl TE8. Typical yields for this DNA prep were 3-5 μg per 15 ml bacterial culture. Ten to 15 μl of DNA was used for EcoRI restriction analysis; 5 μl was used for NotI digestion and clone insert sizing by CHEF gel electrophoresis.
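The yield and volume figures above imply an approximate DNA concentration and mass per digest, which the following Python sketch works out. It assumes the full 3-5 μg yield is recovered in the final 50 μl of TE8; the function name is hypothetical.

```python
def concentration_ng_per_ul(yield_ug, volume_ul=50.0):
    """DNA concentration if the whole yield is dissolved in `volume_ul`."""
    return yield_ug * 1000.0 / volume_ul


low_ng_ul = concentration_ng_per_ul(3.0)    # lower end of the typical yield
high_ng_ul = concentration_ng_per_ul(5.0)   # upper end of the typical yield

# Mass of DNA entering each digest, in micrograms.
ecori_ug = (10 * low_ng_ul / 1000.0, 15 * high_ng_ul / 1000.0)
noti_ug = (5 * low_ng_ul / 1000.0, 5 * high_ng_ul / 1000.0)

print(low_ng_ul, high_ng_ul, ecori_ug, noti_ug)
```

On these assumptions the preparation is roughly 60-100 ng/μl, so an EcoRI digest receives about 0.6-1.5 μg of DNA and a NotI digest about 0.3-0.5 μg.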

Autogen 740 BAC DNA preparations were made by dispensing 3 ml of LB media containing 12.5 μg/ml of chloramphenicol into autoclaved Autogen tubes. A single tube was used for each clone. For inoculation, glycerol stocks were removed from −70° C. storage and placed on dry ice. A small portion of the glycerol stock was removed from the original tube with a sterile toothpick and transferred into the Autogen tube. The toothpick was left in the Autogen tube for at least 2 min before discarding. After inoculation the tubes were covered with tape to ensure that the seal was tight. When all samples were inoculated, the tubes were transferred into an Autogen rack holder and placed into a rotary shaker. Cultures were incubated at 37° C. for 16-17 hr at 250 rpm.

Following this, standard conditions for BAC DNA preparation, as defined by the manufacturer, were used to program the Autogen. However, samples were not dissolved in TE8 as part of the program. Instead, DNA pellets were left dry. When the program was completed, the tubes were removed from the output tray and 30 μl of sterile distilled and deionized H₂O was added directly to the bottom of the tube. The tubes were then gently shaken for 2-5 sec and then covered with parafilm and incubated at RT for 1-3 hr. DNA samples were then transferred to an Eppendorf tube and used either directly for sequencing or stored at 4° C. for later use.

6. BAC Clone Characterization:

DNA samples prepared either by manual alkaline lysis or the Autogen protocol were digested with EcoRI for analysis of restriction fragment sizes. These data were used to compare the extent of overlap among clones. Typically 1-2 μg DNA was used for each reaction. Reaction mixtures included: 1× Buffer 2 (NEB); 0.1 mg/ml BSA (NEB); 50 μg/ml RNase A (Boehringer-Mannheim); and 20 U EcoRI (NEB) in a final volume of 25 μl. Digestions were incubated at 37° C. for 4-6 hr. BAC DNA was also digested with NotI for estimation of insert size by CHEF gel analysis (see below). Reaction conditions were identical to those for the EcoRI digestion, except that 20 U NotI were used. Six microliters of 6× Ficoll loading buffer containing bromphenol blue and xylene cyanol was added prior to electrophoresis.

EcoRI digests were analyzed on 0.6% agarose gels (Seakem, FMC Bioproducts, Rockland, Me.) in 1×TBE containing 0.5 μg/ml ethidium bromide. Gels (20 cm×25 cm) were electrophoresed in a Model A4 electrophoresis unit (Owl Scientific) at 50 volts for 20-24 hr. Molecular weight size markers included undigested lambda DNA, HindIII digested lambda DNA, and HaeIII digested φX174 DNA. Molecular weight markers were heated at 65° C. for 2 min prior to loading the gel. Images were captured with a Kodak DC40 CCD camera and analyzed with Kodak 1D software.
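One simple way to compare the extent of overlap among clones from EcoRI fragment sizes is to count bands that match within a sizing tolerance. The Python sketch below illustrates that idea only; the document does not specify the comparison method actually used, and the 3% tolerance, the function name, and the fragment sizes are all hypothetical.

```python
def shared_bands(frags_a, frags_b, tolerance=0.03):
    """Count bands of clone A that match a not-yet-used band of clone B.

    Two bands match when their sizes differ by no more than `tolerance`
    (a fraction) of the larger band. Each band of B is used at most once.
    """
    remaining = sorted(frags_b)
    shared = 0
    for size in sorted(frags_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]  # a band cannot be matched twice
                break
    return shared


# Hypothetical EcoRI fragment sizes (bp) for two BAC clones.
clone_a = [12000, 8500, 6100, 4300, 2200, 900]
clone_b = [8450, 6100, 4350, 3100, 1500]

n_shared = shared_bands(clone_a, clone_b)
fraction = n_shared / min(len(clone_a), len(clone_b))
print(n_shared, fraction)
```

Here three bands are shared, or 0.6 of the bands in the smaller fingerprint. A relative tolerance is used because absolute sizing error on an agarose gel grows with fragment length.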

NotI digests were analyzed on a CHEF DRII (Bio-Rad) electrophoresis unit according to the manufacturer's recommendations. Briefly, 1% agarose gels (Bio-Rad pulsed field grade) were prepared in 0.5×TBE, equilibrated for 30 min in the electrophoresis unit at 14° C., and electrophoresed at 6 volts/cm for 14 hr with circulation. Switching times were ramped from 10 sec to 20 sec. Gels were stained after electrophoresis in 0.5 μg/ml ethidium bromide. Molecular weight markers included undigested lambda DNA, HindIII digested lambda DNA, lambda ladder PFG ladder, and low range PFG marker (all from NEB).

7. BAC Endsequencing:

Sequencing of BAC insert ends utilized DNA prepared by either of the two methods described above. The ends of BAC clones were sequenced for the purpose of filling gaps in the physical map and for gene discovery information. The following vector primers specific to the BAC vector pBACe3.6 were used to generate end sequence from BAC clones: pBAC 5′-2 (TGT AGG ACT ATA TTG CTC; SEQ ID NO: 5229) and pBAC 3′-1 (CGA CAT TTA GGT GAC ACT; SEQ ID NO: 5230).

The ABI dye-terminator sequencing protocol was used to set up sequencing reactions for 96 clones. The BigDye (ABI; PE Applied Biosystems) Terminator Ready Reaction Mix with AmpliTaq FS, Part number 4303151, was used for sequencing with fluorescently labeled dideoxy nucleotides. A master sequencing mix was prepared for each primer reaction set, and included: 1600 μl of BigDye terminator mix (ABI; PE Applied Biosystems); 800 μl of 5×CSA buffer (ABI; PE Applied Biosystems); and 800 μl of primer (either pBAC 5′-2 or pBAC 3′-1 at 3.2 μM). The sequencing cocktail was vortexed to ensure it was well-mixed and 32 μl was aliquoted into each PCR tube. Eight microliters of the Autogen DNA for each clone was transferred from the DNA source plate to a corresponding well of the PCR plate. The PCR plates were sealed tightly and centrifuged briefly to collect all the reagents. Cycling conditions were as follows: 1) 95° C. for 5 min; 2) 95° C. for 30 sec; 3) 50° C. for 20 sec; 4) 65° C. for 4 min; 5) steps 2 through 4 were repeated 74 times; and 6) samples were stored at 4° C.
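The volumes given above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the component volumes, the 3.2 μM primer concentration, the 32 μl aliquot, and the 8 μl template volume are taken from the text, while the function and variable names are hypothetical and the derived concentrations are simple dilution calculations, not values stated in the source.

```python
# Sketch: master-mix arithmetic for one primer reaction set.
# Input values come from the text; all names are illustrative.

MASTER_MIX_UL = {
    "BigDye terminator mix": 1600.0,
    "5x CSA buffer": 800.0,
    "primer stock": 800.0,
}
PRIMER_STOCK_UM = 3.2   # primer concentration before mixing (uM)
ALIQUOT_UL = 32.0       # cocktail dispensed into each PCR tube
TEMPLATE_UL = 8.0       # Autogen DNA added to each tube
N_CLONES = 96


def summarize_master_mix():
    """Return total volume, reactions supported, and primer dilutions."""
    total_ul = sum(MASTER_MIX_UL.values())
    reactions = int(total_ul // ALIQUOT_UL)
    reaction_ul = ALIQUOT_UL + TEMPLATE_UL
    # Primer is diluted once into the cocktail and again by the template.
    primer_in_mix_um = PRIMER_STOCK_UM * MASTER_MIX_UL["primer stock"] / total_ul
    primer_final_um = primer_in_mix_um * ALIQUOT_UL / reaction_ul
    return {
        "total_ul": total_ul,
        "reactions": reactions,
        "spare_reactions": reactions - N_CLONES,
        "reaction_ul": reaction_ul,
        "primer_in_mix_um": primer_in_mix_um,
        "primer_final_um": primer_final_um,
    }


if __name__ == "__main__":
    for key, value in summarize_master_mix().items():
        print(f"{key}: {value}")
```

Under these figures the cocktail covers the 96 clones with a small surplus for pipetting losses.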

At the end of the sequencing reaction, the plates were removed from the thermocycler and centrifuged briefly. Centri•Sep 96 plates were then used according to manufacturer's recommendations to remove unincorporated nucleotides, salts, and excess primers. Each sample was resuspended in 1.5 μl of loading dye, and 1.3 μl of the mixture was loaded onto ABI 377 Fluorescent Sequencers. The resulting end sequences were then used to develop markers to rescreen the BAC library and fill sequence gaps. The end sequences were also analyzed by BLASTN to identify EST or gene content. The BAC end sequences correspond to SEQ ID NO:156 to SEQ ID NO:693, disclosed herein.

Example 5 Subcloning and Sequencing of BACs From 12q23-qter

The physical map of the chromosome 12 region provided a set of BAC clones for use as sequencing templates (see FIGS. 5A-5I). BAC DNA was isolated either by a QIAGEN purification (QIAGEN, Inc., Valencia, Calif., per manufacturer's instructions) or by a manual purification. The manual purification method was a modification of the standard alkaline lysis/cesium chloride preparation for plasmid DNA (see e.g., F. M. Ausubel et al., 1997, Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.).

Briefly, for manual purification, cells were pelleted, and resuspended in GTE (50 mM glucose, 25 mM Tris-Cl (pH 8), and 10 mM EDTA) and lysozyme (50 mg/ml solution). This was followed by addition of NaOH/SDS (1% SDS and 0.2N NaOH) and then an ice-cold solution of 3M KOAc (pH 4.5-4.8). RNase A was added to the filtered supernatant, followed by Proteinase K and 20% SDS. The DNA was precipitated with isopropanol, and then dried, and resuspended in TE (10 mM Tris, 1 mM EDTA (pH 8.0)). The BAC DNA was further purified by cesium chloride density-gradient centrifugation (Ausubel et al., 1997). Following isolation, the BAC DNA was hydrodynamically sheared using HPLC (Hengen et al., 1997, Trends in Biochem. Sci., 22:273-274) to an insert size of 2000-3000 bp. After shearing, the DNA was concentrated and separated on a standard 1% agarose gel. A single fraction, corresponding to the approximate size, was excised from the gel and purified by electroelution (Sambrook et al., 1989).

The purified DNA fragments were then blunt-ended using T4 DNA polymerase. The blunt-ended DNA was then ligated to unique BstXI-linker adapters (5′ GTCTTCACCACGGGG (SEQ ID NO: 5231) and 5′ GTGGTGAAGAC (SEQ ID NO: 5232)) in 100-1000 fold molar excess. These adapters were complementary to the BstXI-cut pMPX vector, whereas the BstXI-cut vector was not self-complementary. Therefore, the adapters would not concatemerize, and the cut vector would not ligate to itself. The linker-adapted inserts were separated from unincorporated linkers on a 1% agarose gel and purified using GeneClean (BIO 101, Inc., Vista, Calif.). The linker-adapted insert was then ligated to a modified pBlueScript vector to construct a “shotgun” subclone library. The vector contained an out-of-frame lacZ gene at the cloning site, which became in-frame in the event that an adapter-dimer was cloned. Such adapter-dimer clones gave rise to blue colonies, which were avoided.

Sequencing was performed using ABI377 automated DNA sequencing methods. Major modifications to the protocols are highlighted as follows. Briefly, the library was transformed into DH5-competent cells (GibcoBRL, DH5-transformation protocol). Transformed cells were plated onto antibiotic plates containing ampicillin and IPTG/X-gal. The plates were incubated overnight at 37° C. White colonies were identified, and plated to obtain individual clones for sequencing. Cultures were grown overnight at 37° C. DNA was purified using a silica bead DNA preparation method (Ng et al., 1996, Nucl. Acids Res., 24:5045-5047). In this manner, 25 μg of DNA was obtained per clone.

Purified DNA samples were sequenced using ABI dye-terminator chemistry. The ABI dye terminator sequence reads were run on ABI377 machines, and the data were directly transferred to UNIX machines following lane tracking of the gels. All reads were assembled using PHRAP (P. Green, Abstracts of DOE Human Genome Program Contractor-Grantee Workshop V, January 1996, p. 157) with default parameters and quality scores. Each BAC was sequenced for ˜3× coverage. SEQ ID NOs for assembled contigs are shown in Table 3A, below.
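The relationship between read count and fold coverage can be sketched as follows. This is a minimal illustration of the usual definition (total sequenced bases divided by insert length); the read length and BAC insert size in the usage example are assumed values chosen for illustration and are not given in the text.

```python
import math


def fold_coverage(n_reads, mean_read_len_bp, insert_size_bp):
    """Average sequencing depth: total bases read / insert length."""
    return n_reads * mean_read_len_bp / insert_size_bp


def reads_needed(target_coverage, mean_read_len_bp, insert_size_bp):
    """Smallest whole number of reads reaching the target depth."""
    return math.ceil(target_coverage * insert_size_bp / mean_read_len_bp)


if __name__ == "__main__":
    # Assumed figures for illustration only: a 150 kb insert, 500 bp reads.
    n = reads_needed(3, 500, 150_000)
    print(n, fold_coverage(n, 500, 150_000))
```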

TABLE 3A
BAC SEQUENCES

Genomic Sequence | SEQ ID NO: Range
RP11-666B20 | 719-765
RP11-702C13 | 766-808
RP11-723P10 | 809-869
RP11-831E18 | 870-899
RP11-899A17 | 900-927
RP11-932D22 | 928-978

Additional BAC sequences (GenBank (www.ncbi.nlm.nih.gov)) were also investigated as potentially containing a gene or genes involved in asthma and related diseases.

TABLE 3B
BAC SEQUENCES

Genomic Sequence | SEQ ID NO:
AC003982 | 694
AC011216 | 695
AC023437 | 696
AC024021 | 697
AC024642 | 698
AC025641 | 699
AC025837 | 700
AC026331 | 701
AC026333 | 702
AC026336 | 703
AC026764 | 705
AC026869 | 704
AC048337 | 706
AC063926 | 707
AC069209 | 708
AC073527 | 709
AC073862 | 710
AC073912 | 711
AC073930 | 712
AC078925 | 713
AC078926 | 714
AC079031 | 715
AC079602 | 716
AC090147 | 717
AC090565 | 718
Z98941 | 979

Example 6 Gene Identification

1. Gene Identification from Clustered DNA Fragments.

DNA sequences corresponding to gene fragments in public databases (GenBank and human dbEST) and proprietary cDNA sequences (IMAGE consortium and direct selected cDNAs) were masked for repetitive sequences and clustered using the PANGEA Systems EST clustering tool (DoubleTwist, Oakland, Calif.). The clustered sequences were then subjected to computational analysis to identify regions bearing similarity to known genes. This protocol included the following steps:

a. The clustered sequences were compared to the publicly available UniGene database (NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were: E=0.05, V=50, B=50, where E was the expected probability score cutoff, V was the number of database entries returned in the reporting of the results, and B was the number of sequence alignments returned in the reporting of the results (Altschul et al., 1990).

b. The clustered sequences were compared to the GenBank database (NCBI) using BLASTN2 (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

c. The clustered sequences were translated into protein sequences for all six reading frames, and the protein sequences were compared to a non-redundant protein database compiled from GenPept, SWISSPROT, and PIR (NCBI). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

d. The clustered sequences were compared to BAC sequences (see below) using BLASTN2 (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.
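Step c above relies on translating each nucleotide sequence in all six reading frames before the protein database search. The following Python sketch shows one way such a translation can be done; it is not the software used in this example. It assumes the standard genetic code, and the function names and the choice to report unresolvable codons (for example, those containing N or a masking X) as "X" are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons are reported as 'X'."""
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(seq):
    """Return a dict mapping '+1'..'+3' and '-1'..'-3' to protein strings."""
    seq = seq.upper()
    frames = {}
    for sign, strand in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
```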

2. Gene Identification from BAC Genomic Sequence:

Following assembly of the BAC sequences into contigs, the contigs were subjected to computational analyses to identify coding regions and regions bearing DNA sequence similarity to known genes. This protocol included the following steps:

a. Contigs were degapped. The contig sequences contained symbols that represented locations where the individual ABI sequence reads had insertions or deletions (denoted by periods). Prior to automated computational analysis of the contigs, the periods were removed. The original contig sequences were held for future reference.

b. BAC vector sequences were masked within the sequence by using the program CROSSMATCH (P. Green, University of Washington; Seattle, Wash.). Shotgun library construction (detailed above) left BAC vector sequences in the shotgun libraries. The CROSSMATCH program was used to compare the sequence of the BAC contigs to the BAC vector and to mask any vector sequence prior to subsequent steps. Masked sequences were marked by “Xs” in the sequence files, and were omitted during subsequent analyses.

c. E. coli sequences contaminating the BAC sequences were masked by comparing the BAC contigs to the entire E. coli genome.

d. Repetitive elements known to be common in the human genome were masked using CROSSMATCH (P. Green, University of Washington). In this implementation of CROSSMATCH, the BAC sequence was compared to a database of human repetitive elements (J. Jurka, Genetic Information Research Institute, Palo Alto, Calif.). The masked repeats were marked by “Xs” in the sequence files, and were omitted during subsequent analyses.

e. The location of exons within the sequence was predicted using the MZEF computer program (Zhang, 1997, Proc. Natl. Acad. Sci., 94:565-568) and GenScan gene prediction program (Burge and Karlin, J. Mol. Biol., 268:78-94).

f. The sequence was compared to the publicly available UniGene database (NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were: E=0.05, V=50, B=50, where E was the expected probability score cutoff, V was the number of database entries returned in the reporting of the results, and B was the number of sequence alignments returned in the reporting of the results (Altschul et al., 1990).

g. The nucleotide sequence was translated into amino acid sequences for all six reading frames, and the amino acid sequences were compared to a non-redundant protein database compiled from GenPept, SWISSPROT, and PIR (NCBI). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

h. The BAC DNA sequence was compared to a database of clustered sequences using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above. The database of clustered sequences was prepared utilizing a proprietary clustering technology (PANGEA Systems, Inc.). The clustering program compiled cDNA clones derived from direct selection experiments (described below), human dbEST sequences mapping to the 12q23-ter region, proprietary cDNAs, GenBank genes, and IMAGE consortium cDNA clones.

i. The BAC sequence was compared to the BAC end sequences from the 12q23-ter region using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

j. The BAC sequence was compared to the GenBank database (NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

k. The BAC sequence was compared to the STS division of GenBank database (NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

l. The BAC sequence was compared to the Expressed Sequence Tag (EST) GenBank database (NCBI) using the BLASTN2 algorithm (Altschul et al., 1997). The parameters for this search were E=0.05, V=50, B=50, where E, V, and B were defined as above.

m. The exon prediction programs MZEF (Zhang, 1997, Proc. Natl. Acad. Sci. USA 94:565-568) and GenScan (Burge and Karlin, J. Mol. Biol., 268:78-94) were also utilized to help identify the exons.
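The degapping and masking conventions in steps a through d can be illustrated with a short Python sketch. It is not the CROSSMATCH program and does not perform any alignment; it assumes that the intervals to be masked have already been identified by some comparison step and are supplied as half-open (start, end) coordinates. All function names are hypothetical.

```python
import re


def degap(contig):
    """Remove the periods that mark insertion/deletion positions (step a)."""
    return contig.replace(".", "")


def mask_intervals(seq, intervals):
    """Overwrite each half-open [start, end) interval with 'X' characters."""
    chars = list(seq)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = "X"
    return "".join(chars)


def unmasked_segments(seq, min_len=1):
    """Return (offset, subsequence) pairs for stretches containing no 'X'.

    These are the parts of the contig that later analyses would still see.
    """
    return [
        (m.start(), m.group())
        for m in re.finditer(r"[^X]+", seq)
        if len(m.group()) >= min_len
    ]


if __name__ == "__main__":
    contig = degap("ACG.TAC..GTTGCA")
    masked = mask_intervals(contig, [(3, 7)])  # interval is illustrative
    print(masked, unmasked_segments(masked))
```

Keeping the offsets alongside each unmasked segment allows any later hit to be mapped back to coordinates in the degapped contig.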

The results of BLAST searches of protein and nucleotide databases are summarized in Table 4. Column 1 lists the gene names, and column 2 lists the types of sequences (i.e., Gene, Expressed Sequence Tag (EST), etc.). Columns 3 and 4 list the SEQ ID NOs for the nucleotide and amino acid sequences, respectively. Column 5 lists the GenBank accession numbers. Column 6 lists the descriptions of the genes or ESTs relating to potential functions. Using this information, one of ordinary skill in the art is able to appreciate the roles of these genes and their relation to the disorders described herein. The seventh column lists the genetic markers, and the eighth column lists the corresponding BAC clones. The SEQ ID NOs corresponding to the BAC clones are shown in Tables 3A and 3B, above. It should be noted that 12q23-qter alternate splice variants are referred to herein using both short (e.g., 561.1, 561.2, etc.; see Table 4, column 1) and long (e.g., 561.nt1, 561.nt2; see Example 14) nomenclature.

TABLE 4
IDENTIFIED GENES

Gene Number | Gene Type | SEQ ID NO: (NT) | SEQ ID NO: (AA) | GenBank Accession # | Description | Marker | Genomic Seq
214.1 | Gene | 1 | 93 | U14383 | MUC8, Mucin 8 | TH | RP11-702C13, AC079031
214.2 | Gene | 2 | 94 | U14383 | MUC8, Mucin 8 | TH | RP11-702C13, AC079031
214.3 | Gene | 3 | 95 | U14383 | MUC8, Mucin 8 | TH | RP11-702C13, AC079031
214.4 | Gene | 4 | 96 | U14383 | MUC8, Mucin 8 | TH | RP11-702C13, AC079031
214.5 | Gene | 5 | 97 | U14383 | MUC8, Mucin 8 | TH | RP11-702C13, AC079031
215.1 | Gene | 6 | 98 | NM_004072 | CMKLR1, Chemokine-like receptor 1 | sts-U79526 |
224.1 | Gene | 7 | 99 | U17327 | NOS1, Nitric oxide synthase 1, Neuronal | RK903904 |
266.1 | Gene | 8 | 100 | L07395 | PPP1CC, Protein phosphatase 1, catalytic subunit, gamma isoform | SHGC11024 |
283.1 | Gene | 9 | 101 | AF055581 | Lnk, Lymphocyte adaptor protein | SGC35065 |
292.1 | Gene | 10 | 102 | AF032437 | MAPKAPK5, Mitogen activated protein kinase activated protein kinase gene | SGC34324 |
298.1 | Gene | 11 | 103 | D13540 | PTPn11, Protein-tyrosine phosphatase 2C | WI-7628 |
321.1 | Gene | 12 | 104 | AB007447 | Fln29, TRAF interacting Zn finger protein | A002Y44 |
399.1 | Gene | 13 | 105 | U14588 | PXN Paxillin gamma | sts-AA002185 |
399.2 | Gene | 14 | 106 | U14588 | PXN Paxillin gamma | sts-AA002185 |
399.3 | Gene | 15 | 107 | U14588 | PXN Paxillin gamma | sts-AA002185 |
422.1 | Gene | 16 | 108 | M21054 | PLA2, phospholipase A2, group IB | PLA2G1B | AC003982, AC078926, AC073930
436.1 | Gene | 17 | 109 | AF191093 | P2RX4, P2X4, P2x purinoreceptor, Ligand gated ion channel | sts-Y07684 | AC069209, AC048337, AC011216, AC024642
436.2 | Gene | 18 | 110 | AF191093 | P2RX4, P2X4, P2x purinoreceptor, Ligand gated ion channel | sts-Y07684 | AC069209, AC048337, AC011216, AC024642
454.1 | Gene | 19 | 111 | Y09561 | P2X7, ATP ligand gated cationic channel | stSG36007 | Z98941, AC011216, AC069209, AC024642
515.1 | Gene | 20 | 112 | NM_006018 | HM74, Probable G protein-coupled receptors | WI-7227 |
536.1 | EST | 21 | 113 | | | A004O46 |
543.1 | Gene | 22 | 114 | AB009010.1 | UBC, ubiquitin C | Bda03b10 |
548.1 | Gene | 23 | 115 | Z22555.1 | CLA-1, CD36 antigen (collagen type I receptor, thrombospondin receptor)-like | stSG1936 |
549.1 | EST | 24 | | AA017225 | | sts-AA017225 |
550.1 | EST | 25 | 116 | A004F14 | | A004F14 |
550.2 | EST | 26 | 1176 | A004F14 | | A004F14 |
551.1 | EST | 27 | 1187 | H92073 | | SGC31333 |
553.1 | EST | 28 | | R41805 | | WI-12422 |
555.1 | EST | 29 | | N50054 | | stSG21539 |
559.1 | Gene | 30 | 119 | X52351 | Kox20, zinc finger protein | ZNF26 |
561.1 | EST | 31 | 120 | AB002316.1 | RIMBP2 | WI-13120 | AC063926, AC025837, AC090147, AC090565, AC024021, RP11-831E18
561.2 | EST | 32 | 121 | AB002316.1 | RIMBP2 | WI-13120 | AC063926, AC025837, AC090147, AC090565, AC024021, RP11-831E18
562.1 | EST | 33 | | T50448 | | stSG22703 |
566.1 | Gene | 34 | 122 | AF113003.1 | SMRT, Silencing mediator of retinoid and thyroid hormone action | stSG15434 |
567.1 | EST | 35 | | AA167552 | | stSG30525 |
570.1 | EST | 36 | 123 | H30072 | Highly similar to Peptide transporter PTR2, [Saccharomyces cerevisiae] | SGC30248 | AC023437, RP11-666B20
570.2 | EST | 37 | 124 | H30072 | Highly similar to Peptide transporter PTR2, [Saccharomyces cerevisiae] | SGC30248 | AC023437, RP11-666B20
571.1 | EST | 38 | | N20163 | | sts-N20163 |
572.1 | EST | 39 | | AF052172 | | Cda0af01 |
575.1 | EST | 40 | 125 | R24284 | | SGC34088 |
577.1 | Gene | 41 | 126 | J05158 | CPN, Carboxypeptidase N | stSG16387 |
579.1 | EST | 42 | | H20731 | | WI-15018 |
581.1 | Gene | 43 | | H23544 | RAN, TC4, Ras-like protein | WI-16177 | AC073912, RP11-899A17
581.2 | Gene | 44 | | H23544 | RAN, TC4, Ras-like protein | WI-16177 | AC073912, RP11-899A17
583.1 | EST | 45 | 127 | A006O16 | | A006O16 |
584.1 | EST | 46 | | H64839 | | H64839 |
586.1 | EST | 47 | | AA180186 | | stSG30906 |
587.1 | EST | 48 | | T50225 | | stSG26056 |
589.1 | Gene | 49 | 128 | AA025934 | CAGH32 | stSG1702 |
590.1 | EST | 50 | | N59820 | | sts-N59820 |
592.1 | Gene | 51 | 129 | AF045458.1 | ULK1, Homo sapiens serine/threonine kinase | IB2452 |
594.1 | EST | 52 | 130 | AA252357 | | AA252357 |
595.1 | Gene | 53 | 131 | AF116238.1 | PUS1, pseudouridine synthase 1 | stSG4720 |
596.1 | EST | 54 | 132 | AA001424 | | sts-AA001424 |
601.1 | EST | 55 | | R50113 | | R50113 |
603.1 | EST | 56 | | H94865 | | sts-H94865 |
604.1 | EST | 57 | | A006R19 | | A006R19 |
605.1 | EST | 58 | 133 | N23648 | similar to ZN91_HUMAN ZINC FINGER PROTEIN 91 | SGC34278 |
605.2 | EST | 59 | | N23648 | similar to ZN91_HUMAN ZINC FINGER PROTEIN 92 | SGC34278 |
606.1 | EST | 60 | 134 | A004B47 | Similar to DNA Polymerase epsilon, catalytic subunit | A004B47 |
608.1 | EST | 61 | 135 | AB014592 | | stSG40199 |
611.1 | EST | 62 | 136 | A005Q05 | | A005Q05 |
615.1 | Gene | 63 | 137 | D63997 | GCP170, Golgi membrane protein | CDA1JF08 |
617.1 | Gene | 64 | 138 | U09368 | ZNF140 | stSG31494 |
618.1 | EST | 65 | 139 | R44594 | ZNF84 | stSG40222 |
621.1 | EST | 66 | 140 | R81342 | ZNF10 | sts-R81342 |
622.1 | EST | 67 | | H65839 | | sts-H65839 |
690.1 | EST | 68 | 141 | AA812723 | | stSG60065 |
692.1 | EST | 69 | | R24284 | similar to reverse transcriptase homolog [H. sapiens] | WI-AF856 |
693.1 | EST | 70 | 142 | AA678190 | | stSG52343 |
694.1 | EST | 71 | | AA897697 | | stSG54325 |
695.1 | EST | 72 | | AA705809 | | stSG63473 |
697.1 | EST | 73 | | AA889526 | | stSG53307 |
698.1 | GENE | 74 | 143 | AF104260.1 | hiwi | stSG53541 | AC025837, RP11-831E18, AC090147, AC090565
699.1 | GENE | 75 | 144 | N49217 | SFRS8, Splicing factor, arginine/serine-rich 8, (suppressor-of-white-apricot Drosophila homolog) | stSG43910 |
702.1 | GENE | 76 | 145 | X89576 | MMP17, matrix metalloproteinase 17 (membrane-inserted) | stsX89576 | RP11-932D22, RP11-723P10
705.1 | EST | 77 | 146 | AA846540 | | stSG54842 |
707.1 | EST | 78 | 147 | AA223499 | | stSG48386 |
707.2 | EST | 79 | 148 | AA223499 | | stSG48386 |
722.1 | GENE | 80 | 149 | D14582 | EPIM, Epimorphin-isoform, Syntaxin family | B0700A09-A2.x | AC073912, RP11-899A17
722.2 | GENE | 81 | | D14582 | EPIM, Epimorphin-isoform, Syntaxin family | B0700A09-A2.x | AC073912, RP11-899A17
748.1 | EST | 82 | | AA625844 | | | AC025641, AC079602, Z98941
749.1 | EST | 83 | | AA969066 | | |
751.1 | EST | 84 | 150 | AL162032 | Similar to latrophilin-3 | | AC073527, AC078925, AC073862
752.1 | EST | 85 | | AI184706 | | |
753.1 | EST | 86 | | AL039191 | | |
754.1 | EST | 87 | | AI240327 | | |
755.1 | EST | 88 | 151 | AB031230 | PCCX2 mRNA for protein containing CXXC domain 2 | |
756.1 | EST | 89 | 152 | AB028999 | | |
757.1 | Gene | 90 | 153 | AB027464 | FZD10, Frizzled 10 | | AC026336, AC026869, AC026764
835.1 | EST | 91 | 154 | AL136697 | CABP1, Calcium binding protein 1 (calbrain) | |
848.1 | EST | 92 | 155 | AA214469 | | | AC026336, AC026869, AC026764

Example 7 cDNA Cloning

1. Construction and Screening of cDNA Libraries:

Directionally cloned cDNA libraries from normal lung and bronchial epithelium were constructed using standard methods (Soares et al., 1994, Automated DNA Sequencing and Analysis, Adams et al. (eds), Academic Press, NY, pp. 110-114). Total and cytoplasmic RNAs were extracted from tissue or cells by homogenizing the sample in the presence of guanidinium thiocyanate-phenol-chloroform extraction buffer (e.g., Chomczynski and Sacchi, 1987, Anal. Biochem. 162:156-159) using a polytron homogenizer (Brinkman Instruments, Inc.; Westbury, N.Y.). Poly(A)+ RNA was isolated from total/cytoplasmic RNA using dynabeads-dT according to the manufacturer's recommendations (Dynal Biotech; Norway). The double stranded cDNA was then ligated into the plasmid vector pBluescript II KS+ (Stratagene; La Jolla, Calif.), and the ligation mixture was transformed into E. coli host DH10B or DH12S by electroporation (Soares, 1994). Following overnight growth at 37° C., DNA was recovered from the E. coli colonies after scraping the plates as directed for the Mega-prep kit (QIAGEN). The quality of the cDNA libraries was estimated by counting a portion of the total number of primary transformants, determining the average insert size, and calculating the percentage of plasmids without cDNA insert. Additional cDNA libraries (human total brain, heart, kidney, leukocyte, and fetal brain) were purchased from Life Technologies (Bethesda, Md.).

cDNA libraries, both oligo (dT) and random hexamer-primed, were used to isolate cDNA clones mapped within the disorder critical region. Four 10×10 arrays of each of the cDNA libraries were prepared as follows. The cDNA libraries were titered to 2.5×10⁶ using primary transformants. The appropriate volume of frozen stock was used to inoculate 2 L of LB/ampicillin (100 μg/μl). Four hundred aliquots containing 4 ml of the inoculated liquid culture were generated. Each tube contained about 5000 cfu (colony forming units). The tubes were incubated at 30° C. overnight with shaking until an OD of 0.7-0.9 was obtained. Frozen stocks were prepared for each of the cultures by aliquotting 300 μl of culture and 100 μl of 80% glycerol. Stocks were frozen in a dry ice/ethanol bath and stored at −70° C. DNA was isolated from the remaining culture using the QIAGEN spin mini-prep kit according to the manufacturer's instructions. The DNAs from the 400 cultures were pooled to make 80 column and row pools. Markers were designed to amplify putative exons from candidate genes. Once a standard PCR condition was identified and specific cDNA libraries were determined to contain cDNA clones of interest, the markers were used to screen the arrayed library. Positive addresses indicating the presence of cDNA clones were confirmed by a second PCR using the same markers.

Once a cDNA library was identified as likely to contain cDNA clones corresponding to a transcript of interest from the disorder critical region, it was used to isolate a clone or clones containing cDNA inserts. This was accomplished by a modification of the standard “colony screening” method (Sambrook et al., 1989). Specifically, twenty 150 mm LB plus ampicillin agar plates were spread with 20,000 cfu of cDNA library. Colonies were allowed to grow overnight at 37° C. Colonies were then transferred to nylon filters (Hybond from Amersham-Pharmacia, or equivalent) and duplicates prepared by pressing two filters together essentially as described (Sambrook et al., 1989). The “master” plate was then incubated an additional 6-8 hr to allow the colonies additional time to grow. The DNA from the bacterial colonies was then bound to the nylon filters by incubating the filters with denaturing solution (0.5 N NaOH, 1.5 M NaCl) for 2 min, and neutralization solution (0.5 M Tris-Cl pH 8.0, 1.5 M NaCl) for 2 min (twice). The bacterial colonies were removed from the filters by washing in a solution of 2×SSC/2% SDS for 1 min while rubbing with tissue paper. The filters were air-dried and baked under vacuum at 80° C. for 1-2 hr to crosslink the DNA to the filters.
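The row and column pooling scheme described above (four 10×10 arrays, 400 cultures, 80 pools) can be sketched in Python as follows. The sketch assumes that each array contributes ten row pools and ten column pools, which is consistent with the stated total of 80 but is not spelled out in the text; the zero-based indices and all names are hypothetical.

```python
from itertools import product

N_ARRAYS = 4      # four 10x10 arrays per library
ARRAY_SIZE = 10   # rows and columns per array


def all_pools():
    """Enumerate every (array, 'row'|'col', index) pool."""
    pools = []
    for array in range(N_ARRAYS):
        pools += [(array, "row", i) for i in range(ARRAY_SIZE)]
        pools += [(array, "col", i) for i in range(ARRAY_SIZE)]
    return pools


def candidate_addresses(positive_pools):
    """Intersect positive row and column pools within each array.

    Returns (array, row, column) addresses that could hold a positive
    culture. More than one positive row and column in the same array
    yields ambiguous candidates, which is one reason a confirming PCR
    on the individual addresses is useful.
    """
    hits = []
    for array in range(N_ARRAYS):
        rows = sorted(i for a, kind, i in positive_pools if a == array and kind == "row")
        cols = sorted(i for a, kind, i in positive_pools if a == array and kind == "col")
        hits.extend((array, r, c) for r, c in product(rows, cols))
    return hits


if __name__ == "__main__":
    print(len(all_pools()))
    print(candidate_addresses({(0, "row", 2), (0, "col", 7)}))
```

Screening 80 pools in place of 400 individual cultures reduces the number of first-round PCR reactions fivefold.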

cDNA hybridization probes were prepared by random hexamer labeling (Feinberg and Vogelstein, 1983, Anal. Biochem. 132:6-13). For small fragments, probes were prepared using gene-specific primers and omitting random hexamers in the reaction. The colony membranes were pre-washed in 10 mM Tris-Cl pH 8.0, 1 M NaCl, 1 mM EDTA, and 0.1% SDS for 30 min at 55° C. Following the pre-wash, the filters were pre-hybridized in more than 2 ml/filter of 6×SSC, 50% deionized formamide, 2% SDS, 5×Denhardt's solution, and 100 mg/ml denatured salmon sperm DNA, at 42° C. for 30 min. The filters were then transferred to hybridization solution (6×SSC, 2% SDS, 5×Denhardt's, and 100 mg/ml denatured salmon sperm DNA) containing denatured α-³²P-dCTP-labeled cDNA probe, and incubated overnight at 42° C.

The following morning, the filters were washed under constant agitation in 2×SSC/2% SDS at RT (room temperature) for 20 min, followed by two washes at 65° C. for 15 min each. A final wash was performed in 0.5×SSC/0.5% SDS for 15 min at 65° C. Filters were then wrapped in plastic wrap and exposed to radiographic film. Individual colonies on plates were aligned with the autoradiograph, and positive clones were inoculated into a 1 ml solution of LB Broth containing ampicillin. After shaking at 37° C. for 1-2 hr, aliquots of the solution were plated on 150 mm plates for secondary screening. Secondary screening was identical to primary screening (above), except that it was performed on plates containing ˜250 colonies, so that individual colonies could be clearly identified. Positive cDNA clones were characterized by restriction endonuclease cleavage, PCR, and direct sequencing to confirm the sequence identity between the original probe and the isolated clone.

4. Gene Identification in Region 12q23-qter by Direct cDNA Selection:

Direct cDNA selection is a powerful technique for the identification of genes mapping to a particular genomic interval. It involves hybridizing genomic DNA (in this case, BACs) from a region of interest to pools of cDNAs derived from various tissue sources. The procedure permits the rapid isolation of cDNAs, and obviates the need for extensive screening of cDNA libraries. The tissues used in this study included unstimulated Th2 cells, Th2 cells stimulated with TPA, bronchial smooth muscle cells, unstimulated Th0 cells, Th0 stimulated with anti CD3 and TPA, pulmonary artery endothelium cells, lung microvascular endothelial cells, bronchial epithelium cells, normal and asthmatic lung, small airway epithelium cells, pulmonary artery smooth muscle cells, and lung fibroblasts. These cell types have been implicated in the pathophysiology of asthma and were expected to express genes involved in the asthmatic inflammatory response. In addition, RNA isolated from brain cells was used, because brain cells express a diverse array of genes.

Cytoplasmic RNA was isolated as described by Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y. Approximately 400-600 μg of cytoplasmic RNA was isolated from 50 million cells. Total RNA was isolated from normal and asthmatic lung tissue using TRIzol Reagents (GibcoBRL), which are ready-to-use monophasic solutions of guanidinium isothiocyanate and phenol (P. Chomczynski and N. Sacchi, 1987, Anal. Biochem. 162:156-159; P. Chomczynski et al., 1987, J. NIH Res. 6:83; D. Simms et al., 1993, Focus 15:99; P. Chomczynski, 1993, BioTechniques 15:532). Five hundred milligrams of frozen tissue was crushed into a fine powder using a Bessman tissue pulverizer (Fisher Scientific). The TRIzol Reagents were mixed with the crushed tissue according to the manufacturer's recommendations.

To ascertain whether there was genomic DNA or heterogeneous nuclear RNA contamination, PCR and RT/PCR were performed. PCR analysis was performed using primers (Research Genetics) that amplified STS markers from chromosomes 2 (D2S2358), 7 (D7S2776 and D7S685), 10 (D10S228 and D10S1755), and 20 (D20S905 and D20S95). All PCR reactions were performed in a final volume of 25 μl, containing 1 μl of RNA, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5 mM MgCl₂, 0.001% gelatin, 200 mM of each dNTP, 10 μM of each primer, and 1 U Taq DNA polymerase (Perkin Elmer). A Perkin Elmer 9600 cycler was used for amplification as follows: 30 sec at 94° C., 30 sec at 55° C., and 30 sec at 72° C. for 30 cycles. RT/PCR analysis was performed using the SuperScript One-Step RT-PCR System (Gibco-BRL, Rockville, Md.) according to the manufacturer's recommendations. All PCR and RT/PCR products were evaluated by electrophoresis on a 1% agarose gel.

Poly (A)+ RNA was prepared from the total RNA isolated from the human primary cells and lung tissues using Dynabeads Oligo(dT) according to the manufacturer's recommendations (Dynal, Lake Success, N.Y.). Approximately 4 μg of messenger RNA was isolated from 150 μg of total RNA for each cell type and tissue source. Total RNA isolated from brain tissue was purchased from CLONTECH (Palo Alto, Calif.), and poly(A)+ RNA was prepared from this material using Dynabeads Oligo(dT), described above. Oligo dT and random primed cDNA pools were generated from the mRNA isolated from each cell type and tissue source. Briefly, 2.0 μg mRNA was mixed with oligo(dT) primer in one reaction. In another reaction, 2.0 μg mRNA was mixed with random hexamers, and converted to double stranded complementary DNA using the SuperScript Choice System for cDNA Synthesis (Gibco-BRL, Rockville, Md.) according to the manufacturer's recommendations.

Four different paired phosphorylated cDNA linkers (Table 5) were annealed by mixing a 1:1 ratio of the paired linkers (10 μg each), incubating the mixture at 65° C. for 5 min, and allowing the mixture to cool to RT for 30 min. The annealed linkers were ligated to the oligo(dT) and random-primed cDNA pools from various tissue and cell sources (Table 5) according to manufacturer's instructions (GibcoBRL). The linker sequence provided a tag to identify the RNA from the particular cell types.

TABLE 5
PAIRED LINKERS

Paired linkers | Sequence | SEQ ID NO: | Cell/Tissue Type
OLIGO 3 | 5′ CTC GAG AAT TCT GGA TCC TC 3′ | 5233 | Th2/unstimulated (dT + rp); Th0/stimulated/anti CD3 (dT + rp); Pulmonary artery endothelium cells (dT + rp); Lung microvascular endothelial cells (dT + rp); Bronchial epithelium cells (dT + rp)
OLIGO 4 | 5′ TTG AGG ATC CAG AAT TCT CGA G 3′ | 5234 | (paired with OLIGO 3)
OLIGO 5 | 5′ TGT ATG CGA ATT CGC TGC GCG 3′ | 5235 | Normal lung (dT + rp); Asthmatic lung (dT + rp); Th2/stimulated/TPA (dT + rp); Bronchial smooth muscle cells (dT + rp)
OLIGO 6 | 5′ TTC GCG CAG CGA ATT CGC ATA CA 3′ | 5236 | (paired with OLIGO 5)
OLIGO 9 | 5′ CCT ACG GAA TTC TCA CTC AGC 3′ | 5237 | Brain (dT + rp); Th0/unstimulated (dT + rp); Pulmonary artery smooth muscle cells (dT + rp)
OLIGO 10 | 5′ TTG CTG AGT GAG AAT TCC GTA GG 3′ | 5238 | (paired with OLIGO 9)
OLIGO 11 | 5′ GAA TCC GAA TTC CTG GTC AGC 3′ | 5239 | Lung fibroblasts (dT + rp); Th0/stimulated/TPA (dT + rp); Small airway epithelium cells (dT + rp)
OLIGO 12 | 5′ TTG CTG ACC AGG AAT TCG GAT TC 3′ | 5240 | (paired with OLIGO 11)
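The pairing of the linkers in Table 5 can be checked computationally. The Python sketch below tests, for each pair, whether the second oligonucleotide ends with the reverse complement of the first, and reports any extra 5′ bases on the second oligonucleotide. The sequences are copied from Table 5; the function names and the interpretation of the extra bases as a single-stranded overhang in the annealed duplex are illustrative assumptions rather than statements from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Linker sequences from Table 5, written 5' to 3' without spaces.
LINKER_PAIRS = {
    ("OLIGO 3", "OLIGO 4"): ("CTCGAGAATTCTGGATCCTC", "TTGAGGATCCAGAATTCTCGAG"),
    ("OLIGO 5", "OLIGO 6"): ("TGTATGCGAATTCGCTGCGCG", "TTCGCGCAGCGAATTCGCATACA"),
    ("OLIGO 9", "OLIGO 10"): ("CCTACGGAATTCTCACTCAGC", "TTGCTGAGTGAGAATTCCGTAGG"),
    ("OLIGO 11", "OLIGO 12"): ("GAATCCGAATTCCTGGTCAGC", "TTGCTGACCAGGAATTCGGATTC"),
}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def five_prime_overhang(first, second):
    """Return the 5' bases of `second` left unpaired when it anneals to `first`.

    Returns None if `second` does not end with the reverse complement of
    `first`, that is, if the two oligonucleotides are not fully paired.
    """
    paired = reverse_complement(first)
    if not second.endswith(paired):
        return None
    return second[: len(second) - len(paired)]


if __name__ == "__main__":
    for names, (first, second) in LINKER_PAIRS.items():
        print(names, five_prime_overhang(first, second))
```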

The cDNA pools were evaluated for length distribution by PCRamplification using 1 μl of a 1:1, 1:10, and 1:100 dilution of theligation reaction. All PCR reactions were performed in a final volume of25 μl, containing 1 μl of DNA, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 1.5mM MgCl₂, 0.001% gelatin, 200 mM of each dNTP, 10 μM of each primer, and1 U Taq DNA polymerase (Perkin Elmer). A Perkin Elmer 9600 cycler wasused to for amplification as follows: 30 seconds at 94° C., 30 secondsat 55° C., and 2 minutes at 72° C. for 30 cycles. The lengthdistribution of the amplified cDNA pools was evaluated byelectrophoresis on a 1% agarose gel. The PCR reaction that gave the bestrepresentation of the random primed and oligo dT primed cDNA pools wasscaled-up to yield ˜2-3 μg of each cDNA pool. This represented a 1×PCRreaction for the starting cDNA pools.

Twenty BACs (Table 6) that spanned the 15 cM critical region between markers D12S1609 and D12S357 were pooled in equimolar amounts. One microgram of the isolated genomic DNA was labeled with biotin 16-UTP by nick translation in accordance with the manufacturer's instructions (Boehringer-Mannheim). The incorporation of biotin was monitored by standard methods (Del Mastro and Lovett, 1996, Methods in Molecular Biology, Humana Press Inc., NJ).

TABLE 6
BACs SPANNING THE 15 cM REGION

0753B07   0666B20   0687F10   0820N16   0899A17
0716I10   0839D11   0894M06   0696L08   0979G13
0723P10   0932D22   0825K21   0866B05   0750I23
0831E18   0761L21   0702C13   0739N03   1064I09

Direct cDNA selection was performed using standard methods (Del Mastro and Lovett, 1996, Methods in Molecular Biology, Humana Press Inc., NJ). Briefly, 1 μg of each cDNA pool was placed into individual PCR tubes. A total of 30 direct selection experiments were arrayed onto a PCR plate. Suppression of high copy repeats, ribosomal RNA, and plasmid DNA in the cDNA pools was performed to a Cot₂₀. One hundred nanograms of biotinylated BAC DNA was mixed with the suppressed cDNAs, and hybridized in solution to a Cot₂₀₀. The biotinylated DNA and the cognate cDNAs were then captured on streptavidin-coated paramagnetic beads. The beads were washed and the primary selected cDNAs were eluted. The products from the first round of direct selection were PCR amplified using appropriate primers (shown in Table 5), and a second round of direct selection was performed.
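A Cot value is conventionally the product of the starting nucleotide concentration (moles of nucleotide per liter) and the hybridization time (seconds). The sketch below, in Python, shows how a hybridization time could be estimated for a target Cot under stated assumptions. The reaction volume is not given in this disclosure, so the 5 μl used in the example call is hypothetical, as is the approximate average nucleotide mass of 330 g/mol and all function names. Buffer salt effects on reassociation rate are ignored.

```python
# Approximate average mass of one nucleotide residue in DNA (assumption).
AVG_NUCLEOTIDE_G_PER_MOL = 330.0


def nucleotide_molarity(mass_ug, volume_ul):
    """Moles of nucleotide per liter for `mass_ug` of DNA in `volume_ul`."""
    moles = (mass_ug * 1e-6) / AVG_NUCLEOTIDE_G_PER_MOL
    liters = volume_ul * 1e-6
    return moles / liters


def seconds_to_reach_cot(target_cot, mass_ug, volume_ul):
    """Hybridization time in seconds so that C0 * t equals `target_cot`."""
    return target_cot / nucleotide_molarity(mass_ug, volume_ul)


# Hypothetical example: 1 ug of cDNA in a 5 ul reaction (volume is assumed).
t_cot20 = seconds_to_reach_cot(20, mass_ug=1.0, volume_ul=5.0)
t_cot200 = seconds_to_reach_cot(200, mass_ug=1.0, volume_ul=5.0)
print(f"Cot 20:  {t_cot20 / 3600:.1f} h")
print(f"Cot 200: {t_cot200 / 3600:.1f} h")
```

At a fixed concentration the required time scales linearly with the target Cot, so reaching Cot₂₀₀ takes ten times as long as reaching Cot₂₀ unless the DNA concentration is raised.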

GTP-Binding Nuclear Protein RAN (TC4, a gene that maps within the 7.6 cM critical region) was used to monitor the enrichment during the two rounds of direct selection. Enrichment of TC4 was monitored in the starting, primary, and secondary selected material of the fifteen oligo dT and random primed cDNA pools. The random primed product of the second round of direct selection (the secondary selected material) from lung microvascular endothelial cells, Th0/unstimulated cells, lung fibroblast cells, Th2/unstimulated cells, pulmonary artery endothelium cells, normal lung, small airway epithelium cells, bronchial epithelium cells, Th0 cells stimulated with TPA, and oligo dT primed Th0 cells stimulated with TPA was PCR-amplified with modified primers (Table 7, below). These primers were used for two rounds of direct cDNA selection.

TABLE 7
MODIFIED OLIGONUCLEOTIDES

Modified oligonucleotides   SEQ ID NO   Sequence
OLIGO 3                     5241        5′ CUA CUA CUA CUA CTC GAG AAT TCT GGA TCC TC 3′
OLIGO 5                     5242        5′ CUA CUA CUA CUA TGT ATG CGA ATT CGC TGC GCG 3′
OLIGO 9                     5243        5′ CUA CUA CUA CUA CCT ACG GAA TTC TCA CTC AGC 3′
OLIGO 11                    5244        5′ CUA CUA CUA CUA GAA TCC GAA TTC CTG GTC AGC 3′
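The relationship between the modified oligonucleotides of Table 7 and the corresponding linkers of Table 5 can be illustrated with a short Python sketch. It assumes, based only on comparing the listed sequences, that each modified primer is the original linker sequence preceded by four CUA repeats; the names `REPEAT_TAIL` and `add_tail` are hypothetical.

```python
# Four CUA repeats, as read from the 5' end of every Table 7 sequence.
REPEAT_TAIL = "CUA" * 4

# Original linker sequences (Table 5) and modified primers (Table 7),
# transcribed 5' to 3' with spaces removed.
ORIGINAL = {
    "OLIGO 3": "CTCGAGAATTCTGGATCCTC",
    "OLIGO 5": "TGTATGCGAATTCGCTGCGCG",
    "OLIGO 9": "CCTACGGAATTCTCACTCAGC",
    "OLIGO 11": "GAATCCGAATTCCTGGTCAGC",
}
MODIFIED = {
    "OLIGO 3": "CUACUACUACUACTCGAGAATTCTGGATCCTC",
    "OLIGO 5": "CUACUACUACUATGTATGCGAATTCGCTGCGCG",
    "OLIGO 9": "CUACUACUACUACCTACGGAATTCTCACTCAGC",
    "OLIGO 11": "CUACUACUACUAGAATCCGAATTCCTGGTCAGC",
}


def add_tail(linker, tail=REPEAT_TAIL):
    """Prepend the repeat tail to the 5' end of a linker sequence."""
    return tail + linker


# Names of any primers whose listed sequence differs from tail + linker.
mismatches = [
    name for name, linker in ORIGINAL.items() if add_tail(linker) != MODIFIED[name]
]
print("primers not matching tail + linker:", mismatches)
```

An empty mismatch list indicates that the modified primers retain the full linker sequence, so they should anneal to the same linker-tagged cDNA ends as the unmodified primers.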

The amplified material was cloned into the UDG vector pAMP10 (GibcoBRL) in accordance with the manufacturer's recommendations. Four hundred and eighty clones were picked from each transformed source and arrayed into five 96-well microtiter plates. Each selected cDNA library was stamped, in duplicate, in high density format onto Hybond N+ nylon membrane (Amersham). The bacteria were grown overnight at 37° C., and the membranes were processed as recommended by the manufacturer.

To identify which of the clones represented common contaminants (e.g., high copy repeats and ribosomal RNA), a radiolabeled probe containing 1 μg of Cot₁ DNA and 0.5 μg ribosomal DNA was hybridized at 65° C. to the high density filters (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, N.Y.). The filters were washed three times in buffer (0.1×SSC/0.1% SDS) at 65° C., and were autoradiographed. Those cDNAs that showed duplicate signals were scored as background contaminants. The remaining clones were re-arrayed into 96-well microtiter plates. A total of twenty-three 96-well microtiter plates containing 2208 secondary selected clones were sequenced. This included three 96-well microtiter plates from each of the random primed selections, with two exceptions: only two plates were included for the Th0 cells stimulated with TPA, and only one plate was included for the Th0 cells stimulated with TPA from the oligo dT selection. All cDNA clones were sequenced using M13 dye primer terminator cycle sequencing kits (Applied Biosystems). Data were collected on the ABI 377 automated fluorescence sequencer (Applied Biosystems).
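The plate counts quoted above follow from the 96-well format: five plates hold 480 clones and twenty-three plates hold 2208. As a minimal Python sketch, a clone's position in an arrayed library could be tracked as follows. The row-major A1 to H12 fill order and the function names are assumptions for illustration; the arraying order actually used is not stated.

```python
ROWS = "ABCDEFGH"   # 8 rows of a 96-well plate
COLUMNS = 12        # 12 columns of a 96-well plate
WELLS_PER_PLATE = len(ROWS) * COLUMNS


def total_wells(plates):
    """Number of wells available on the given number of 96-well plates."""
    return plates * WELLS_PER_PLATE


def well_for_clone(index):
    """Map a zero-based clone index to (plate number, well label),
    assuming plates are filled row by row from A1 to H12."""
    plate, offset = divmod(index, WELLS_PER_PLATE)
    row, column = divmod(offset, COLUMNS)
    return plate + 1, f"{ROWS[row]}{column + 1}"


print(total_wells(5), total_wells(23))
print(well_for_clone(0), well_for_clone(479))
```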

Clones representing other contaminants, such as high copy repeats, ribosomal RNA, plasmid DNA, mitochondrial DNA, and E. coli and yeast DNA, that were not identified in the hybridization process were removed from the dataset using in silico methods. This produced a set of cDNA clones corresponding to SEQ ID NO:980 to SEQ ID NO:1766, disclosed herein. These clones were clustered using the PANGEA Systems EST Clustering Tool (Oakland, Calif.), and analyzed with BLASTN, BLASTX, and FASTA programs. This allowed the assembly of full-length gene sequences. The direct selected clones were combined with the ESTs homologous to BAC sequences, BAC end sequences, and sequence within the public domain (dbEST and GenBank), and then clustered using the PANGEA Systems EST Clustering Tool. The clustered sequences (i.e., consensus sequences) correspond to SEQ ID NO:1767 to SEQ ID NO:4687, disclosed herein. In silico and hybridization techniques were used to map the direct selected cDNAs to the 15 cM region. Using well-established sequencing techniques, one skilled in the art could extend these candidate clones into full-length genes that map back to the region.

Example 8 Expression Analysis

In order to characterize the expression of genes mapping to the 12q23-qter region, a series of experiments were performed. First, oligonucleotide primers were designed for PCR and RT-PCR reactions so that cDNA, EST, or genomic DNA could be amplified from a pool of DNA molecules or an RNA population. The PCR primers were used in a reaction containing genomic DNA to verify that they generated a product of the predicted size, based on the genomic sequence. The length, in nucleotides, of the processed transcript or messenger RNA (mRNA) was determined by Northern analysis (Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.). Probes were generated using one of the methods described below.

Briefly, sequence verified IMAGE consortium cDNA clones were digested with appropriate restriction endonucleases to release the insert. The restriction digest was electrophoresed on an agarose gel and the bands containing the insert were excised. The gel piece containing the DNA insert was placed in a Spin-X (Corning Costar Corporation, Cambridge, Mass.) or Supelco spin column (Supelco Park, Pa.) and spun at high speed for 15 min. The DNA was ethanol precipitated and resuspended in TE. Alternatively, PCR products obtained from genomic DNA or RT-PCR were purified as described above. Inserts purified from IMAGE clones were random primer labeled (Feinberg and Vogelstein) to generate probes for hybridization. Probes from purified PCR products were generated by incorporation of α-³²P-dCTP in a second round of PCR. Commercially available Multiple Tissue Northern blots (CLONTECH, Palo Alto, Calif.) were hybridized and washed under conditions recommended by the manufacturer.

FIGS. 6A-6U show Northern blots illustrating the expression of the indicated genes in various tissues. With the exception of Gene 214 (FIG. 6A), all blots were Multiple Tissue Northern Blots (CLONTECH, Palo Alto, Calif.). The tissues included: 1) brain; 2) heart; 3) skeletal muscle; 4) colon; 5) thymus; 6) spleen; 7) kidney; 8) liver; 9) small intestine; 10) placenta; 11) lung; and 12) peripheral blood leukocytes. Size standards (Kb) are indicated to the left of each blot. FIG. 6A shows the Northern blot for Gene 214, which includes poly(A)+ selected RNA from 1) a lymphoblast cell line from an asthmatic individual; 2) lung; and 3) trachea.

RT-PCR was used as an alternative to Northern blotting to detect mRNAs with low levels of expression. Total RNA from multiple human tissues was purchased from CLONTECH (Palo Alto, Calif.), and genomic DNA was removed by DNaseI digestion. The Superscript™ Preamplification System for First Strand cDNA Synthesis (Life Technologies, Gaithersburg, Md.) was used according to the manufacturer's directions, with oligo(dT) or random hexamers, to synthesize cDNA from the DNaseI treated total RNA. Gene specific primers were used to amplify the target cDNAs in a 30 μl PCR reaction containing 0.5 μl of first strand cDNA, 1 μl sense primer (10 μM), 1 μl antisense primer (10 μM), 3 μl dNTPs (2 mM), 1.2 μl MgCl₂ (25 mM), 3 μl 10× PCR buffer, and 1 U Taq Polymerase (Perkin Elmer). The PCR reaction included a denaturation step at 94° C. for 4 min, followed by 30 cycles at 94° C. for 30 sec, 58° C. for 1 min, and 72° C. for 1 min, and an extension step at 72° C. for 7 min. PCR products were analyzed on agarose gels.
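The final concentrations in the 30 μl RT-PCR reaction follow from the stock concentrations and volumes listed above by simple dilution (stock concentration × volume added ÷ total volume). The Python sketch below works through that arithmetic and also sums the programmed hold times of the cycling profile. It assumes that the listed components are the only source of each reagent (for example, any MgCl₂ already present in the 10× buffer is ignored) and that ramp times between temperatures are excluded; all names are illustrative.

```python
REACTION_VOLUME_UL = 30.0

# component -> (stock concentration, volume added in ul); units are in the key.
COMPONENTS = {
    "sense primer (uM)": (10.0, 1.0),
    "antisense primer (uM)": (10.0, 1.0),
    "dNTPs (mM)": (2.0, 3.0),
    "MgCl2 (mM)": (25.0, 1.2),
    "PCR buffer (x)": (10.0, 3.0),
}


def final_concentration(stock, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Concentration after diluting `volume_ul` of stock into `total_ul`."""
    return stock * volume_ul / total_ul


final = {
    name: final_concentration(stock, volume)
    for name, (stock, volume) in COMPONENTS.items()
}


def program_minutes(initial=4.0, cycles=30, steps=(0.5, 1.0, 1.0), final_hold=7.0):
    """Total programmed hold time in minutes, excluding temperature ramps."""
    return initial + cycles * sum(steps) + final_hold


for name, value in final.items():
    print(f"{name}: {value:.3f}")
print("programmed time (min):", program_minutes())
```

Under these assumptions the primers end up at roughly 0.33 μM each, the dNTPs at 0.2 mM, the added MgCl₂ at 1 mM, and the buffer at 1×.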

The 12q23-qter genes are shown in Table 4; the nucleotide sequences correspond to SEQ ID NO:1 to SEQ ID NO:92, the encoded amino acid sequences correspond to SEQ ID NO:93 to SEQ ID NO:155, and the BAC nucleotide sequences correspond to SEQ ID NO:694 to SEQ ID NO:979, as disclosed herein.

Example 9 Mutation Analysis

In order to conduct mutation analysis, the genomic structure of Gene 214, Gene 224, Gene 422, Gene 436, Gene 449, Gene 454, Gene 515, Gene 561, Gene 570, Gene 581, Gene 698, Gene 702, Gene 722, Gene 748, Gene 751, Gene 757 and Gene 848 was determined. For genes with previously unidentified exon-intron boundaries, the cDNA sequences were compared to genomic sequence from the BACs. The precise intron-exon junctions were determined based on the consensus sequences at splice junctions. The exon prediction programs MZEF (Zhang, 1997, Proc. Natl. Acad. Sci., 94:565-568) and GenScan (Burge and Karlin, 1997, J. Mol. Biol., 268:78-94) were also utilized to identify the exons.

Disorder associated candidate genes (Table 4) were identified using the above procedures, and exons from these genes were subjected to mutation detection analysis. A combination of fluorescent single-strand conformation polymorphism (SSCP) analysis (ABI), DNA sequencing, and other sequence analysis methods described herein were utilized to precisely identify and determine nucleotide sequence variants. SSCP analysis was used to screen individual DNA sequences for variants. Briefly, PCR was used to generate templates from unrelated asthmatic individuals that showed increased sharing for the 12q23-qter chromosomal region, and contributed towards linkage. Non-asthmatic individuals were used as controls. Enzymatic amplification of genes within the asthma region of 12q23-qter was accomplished using primers flanking each exon and the putative 5′ regulatory elements of each gene. The primers were designed to amplify each exon, as well as 15 or more base pairs of each intron on either side of the splice site. The forward and the reverse primers had two different dye colors to allow analysis of each strand, and independent confirmation of variants. PCR reactions were optimized for each exon primer pair. Buffer and cycling conditions were specific to each primer set. PCR products were denatured using a formamide dye, and electrophoresed on non-denaturing acrylamide gels with varying concentrations of glycerol (at least two different glycerol concentrations).
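The primer placement rule described above (each amplicon covers the exon plus at least 15 base pairs of flanking intron on either side) can be expressed as a coordinate calculation. The following Python sketch is an illustration only: the exon coordinates are hypothetical, coordinates are taken as 1-based and inclusive on a genomic reference, and the function names are not from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exon:
    start: int  # first exon base, 1-based inclusive (hypothetical coordinates)
    end: int    # last exon base, 1-based inclusive


def target_window(exon, flank=15, sequence_length=None):
    """Return (start, end) of the region an amplicon must cover:
    the exon plus `flank` intronic bases on each side, clipped to the
    bounds of the reference sequence when its length is known."""
    start = max(1, exon.start - flank)
    end = exon.end + flank
    if sequence_length is not None:
        end = min(end, sequence_length)
    return start, end


def primers_cover(exon, forward_3prime_end, reverse_3prime_end, flank=15):
    """True when the forward primer ends before, and the reverse primer
    begins after, the window that must be amplified."""
    start, end = target_window(exon, flank)
    return forward_3prime_end < start and reverse_3prime_end > end


exon = Exon(start=1000, end=1150)
print(target_window(exon))
print(primers_cover(exon, forward_3prime_end=970, reverse_3prime_end=1180))
```

Placing both primers outside the window keeps the splice sites and adjacent intronic bases within the readable part of the product, which is one reason to require a minimum flank rather than priming at the exon boundary.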

Primers utilized in fluorescent SSCP experiments to screen coding and non-coding regions of Gene 214, Gene 224, Gene 422, Gene 436, Gene 449, Gene 454, Gene 515, Gene 561, Gene 570, Gene 581, Gene 698, Gene 702, Gene 722, Gene 748, Gene 751, Gene 757 and Gene 848 for polymorphisms are provided in Table 8. Column 1 lists the genes targeted for mutation analysis. Column 2 lists the specific exons analyzed. Column 3 lists the assigned primer names. Columns 4 and 5 list the forward primer sequences and the reverse primer sequences, respectively. The genes listed in column 1 of Table 8 correspond to the gene identifiers in column 1 of Table 4.

TABLE 8 SSCP PRIMERS Primers used in SSCP experiments SEQ SEQ ID ID GeneExon SSCP Assay NO: Forward Sequence NO: Reverse Sequence 454 A55_454_A_F_56_454_A_R 5245 TGGCCCTGTCAGGAAGAGTA 5271CTGCAGAGATCTGGGTCCTC 454 B 57_454_B_F_58_454_B_R 5246TTGATGCTTTCCCATGTCTG 5272 GGAGAATGCTACGAGGTGCT 454 C59_454_C_F_60_454_C_R 5247 TCAAAGGCCTTGCATTTTCT 5273GTCCGCATTTCTGCTTCTTC 454 D 61_454_D_F_62_454_D_R 5248TCCCCACTCTGTCATCCTTC 5274 GAGGCTGAAGACCTGACCTG 454 E63_454_E_F_64_454_E_R 5249 CCTCTCCGCAGTTCTTTCAC 5275GAGGGCCACTGTGTCTGTCT 454 F 65_454_F_F_66_454_F_R 5250GTATCCCAAAGACCAAGCCA 5276 AACTAAGACAGCCAGGCAGC 454 G67_454_G_F_68_454_G_R 5251 ATGGAACCTCTCCACCACAC 5277TCCCAGTGTACAAAGCACCA 454 H 69_454_H_F_70_454_H_R 5252CTGGCTATGCAGGGAGATGT 5278 GTGAGTTTGACCTGGGCCT 454 K71_454_K_F_72_454_K_R 5253 CCAGAACCCAGCACTTTCA 5279 AGGCTGAGACCAAAACCCTT454 L 73_454_L_F_74_454_L_R 5254 AACCAACAATTGCACGTTGA 5280TGTCGATGAGGAAGTCGATG 454 M 75_454_M_F_76_454_M_R 5255CAGCGCTTGTCTGCATTCT 5281 GGAATCTCTCCGTGTCTTGG 454 N77_454_N_F_78_454_N_R 5256 TGATAATTCTGTACAAAAATGGGTAA 5282CTTTGTTAAAATCCATCAGTTTTG 454 O1 79_454_O_F_80_454_O_R 5257CCTAGAACCTGAGGGCTTGTC 5283 CTGTGGCTCTCAGGGAGTTG 454 O281_454_O_F_82_454_O_R 5258 GGTGCCAGTGTGGAAGATG 5284 AGGTGGCGTAGCACCTGTAG454 O3 83_454_O_F_84_454_O_R 5259 CACCACCTCAGAGCTGTTCA 5285ACTGCCCTTCACTCTTTGGA 454 O4 85_454_O_F_86_454_O_R 5260CCAGGACATGGCTGACTTTG 5286 ACAGACAGGATTTCGCCTTG 454 AA1959_454_AA_F_1960_454_AA_R 5261 GAAATATTCCAATTTTGCCTGG 5287CCGAGGAAAGTGGAGTTGAG 454 AA 1961_454_AA_F_1962_454_AA_R 5262CCTGTTTTGCTTTGAGTCCA 5288 TACTCTCCACCCTCCTCTGC 454 AA1963_454_AA_F_1964_454_AA_R 5263 CCTGGTGATCTTTGGCTGAT 5289ACAACCCTTTATTCAGCCCC 454 AA 1965_454_AA_F_1966_454_AA_R 5264GGGGAGATCTTCATTTACCCA 5290 GTGTTCAGAGGATGGGCATT 454 AA1967_454_AA_F_1968_454_AA_R 5265 GGGGAAAAGGGAGAATTCTAAA 5291CCCTCCCAGTAACTGCAAAA 454 AA 1969_454_AA_F_1970_454_AA_R 5266GCAGTCATTGGAGGAGCTTG 5292 GGAAAAGATGATCACGTGGAA 757 A1750_757_A_F_1751_757_A_R 5267 GAGCAGGGGTGGAGAGCC 5293CAGGTTGGGCATACGAGTCA 
757 A 1752_757_A_F_1753_757_A_R 5268GCAAGGACATCGGCTACAA 5294 ATAATCGGGGAGCACTTGAG 757 A1778_757_A_F_1779_757_A_R 5269 TGCACCGAGCAGGTCTCTAC 5295GTCCTTCAGCGGGTGCTC 757 A 1780_757_A_F_1781_757_A_R 5270AACTACCTGTGCATGGAGGC 5296 GAAGGTGAGCACGGTGAAG 757 A1758_757_A_F_1759_757_A_R 5297 CGTGCTCACCTTCCTCATC 5330GTGAGGACCACCCACCAC 757 A 1760_757_A_F_1761_757_A_R 5298CTGTGGTGGGTGGTCCTC 5331 GTAGCAGGCCAGGGGAAT 757 A1782_757_A_F_1783_757_A_R 5299 TCTGCTACGTGGGCAGCAT 5332CCATGTTGAGGCGTTCGTAA 757 A 1784_757_A_F_1785_757_A_R 5300CTCTGTGCTGTACACCGTGC 5333 GGTTTTCTCCGGCTCTTCTT 757 A1786_757_A_F_1787_757_A_R 5301 CCTCCAAGACTCTGCAGTCC 5334CACAACCAAGAAAAGCACCA 757 A 1788_757_A_F_1789_757_A_R 5302AAATATGAGATCCCTGCCCA 5335 CTTCGCTGGAAAACCAAAAC 757 A1768_757_A_F_1769_757_A_R 5303 TGAAATTCAGGATGCTGTGA 5336TTGCAAAGCAGTTATCTGTCC 757 A 1770_757_A_F_1771_757_A_R 5304TTGAGTTGGCTTTGCTACCC 5337 TGTGAGGTTTGATGGAGGTTT 757 A1772_757_A_F_1773_757_A_R 5305 CTGCAAGACAGAAACCTCCA 5338TCCACAAATCAGTCCAAACG 757 A 1774_757_A_F_1775_757_A_R 5306TAATGGAAACCAAGCCAATG 5339 CAAATATACACACGCAGAAACC 757 A1776_757_A_F_1777_757_A_R 5307 TGCCAGGAAAGAGTGGTTTC 5340GCTAGAAGCACAACCCCAGA 561 A 1530_561_A_F_1531_561_A_R 5308AGGGTATAGGATGCACGCC 5341 CTCCACCACACCAGGGAT 561 B937_561_B_F_938_561_B_R 5309 ACACACATTTCCACCACCAA 5342CATGAACTGTGGGAAAGGCT 561 B 939_561_B_F_940_561_B_R 5310CCGGACTCAAAGTGAGCAGT 5343 ATTTCACCTGTGCACACCCT 561 C941_561_C_F_942_561_C_R 5311 CATGACCAACGTGCTTTGAC 5344ATCTTGCGCTACCGGATCT 561 C 943_561_C_F_944_561_C_R 5312GTCAGGAGAGCGCTATTGGA 5345 AACAGGACAAACTGGCCAAC 561 D945_561_D_F_946_561_D_R 5313 CCTCCAGCTTCAATAACCCA 5346AAATCCCACCTTCTCCTCGT 561 E 947_561_E_F_948_561_E_R 5314TGTGTCCTCCAGAGCCTCTAA 5347 GGGAGCCCTGCCTATCTATC 561 F949_561_F_F_950_561_F_R 5315 CTGTGTTGGCTGGGTGATAA 5348GGCACTGTTGTCGGTGATG 561 F 951_561_F_F_952_561_F_R 5316GAGAGCACATCCTGGACCTC 5349 TTCATGCGTGTCTCCTTGTC 561 F953_561_F_F_954_561_F_R 5317 GCCACCAGGATGGGGAAC 5350 TCTGCGTGATGTTGTCCAC561 F 955_561_F_F_956_561_F_R 5318 
GTGGGCAAGGACGTGGTG 5351CTCCCTTTGCTCCAGCGG 561 F 957_561_F_F_958_561_F_R 5319CACGTCATCTTCCTCAACGA 5352 GGAAGGACACAGGGCTCAC 561 G1532_561_G_F_1533_561_G_R 5320 ACCGAATGATCTCGTTTCCA 5353AAAACTCACCCTCTGCCCTT 561 G 1534_561_G_F_1535_561_G_R 5321CACCCCCACAAGATGTTACC 5354 AGTGATCAGGGCTGGAAGAG 561 H961_561_H_F_962_561_H_R 5322 GGCTCCCCATTGCAGGAC 5355 TGATTGGGGTGCAGGTCTC561 H 963_561_H_F_964_561_H_R 5323 ACTCTGCAGTTGCTGCCGT 5356CTGTGGCTGTGGCAGGAT 561 H 1536_561_H_F_1537_561_H_R 5324CACGCCAGGATGGATGAG 5357 GACTGAGGAGCCACCGAG 561 I 967_561_I_F_968_561_I_R5325 GTAGCTGAAGGTGGCCCTG 5358 CCACCAGGAGGATGGTGT 561 J969_561_J_F_970_561_J_R 5326 TGTAGGATGCGGGAGGAG 5359 AGCTACTCTGGGGACGGAG561 K 971_561_K_F_972_561_K_R 5327 ATGCTGGCGAGACTTACGAC 5360TTTGCTTAGCGGAAAATGCT 561 L 973_561_L_F_974_561_L_R 5328CACGCTCCTCAGTTAGGCTC 5361 CACCTTGATGATCTGGCCTT 561 L975_561_L_F_976_561_L_R 5329 AGACCGCCTTTCTCCAGACT 5362GTCGATACCCTGTTGCCAGT 561 M 977_561_M_F_978_561_M_R 5363CTGAACCAATCAATTACAGTGCT 5396 GATAAAATGCACAGGGAAGGTC 561 N979_561_N_F_980_561_N_R 5364 AGGGGAACACCGCTAAGTTT 5397GTGGTGTACCACGAGGGAAG 561 O 1538_561_O_F_1539_561_O_R 5365TTCTCAAATAGTAAGGGAAAGCA 5398 ATGACGTTCATGCCCAATTT 561 P983_561_P_F_984_561_P_R 5366 TCCTTTAGCCAAAGCAAGATG 5399ATATGGCAGAACGGGACAGA 561 Q 1248_561_Q_F_1249_561_Q_R 5367CCAAGGGCTTCTCAAGCATA 5400 ACACTGGCCCGGTTAAGGTA 561 X1744_561_X_F_1745_561_X_R 5368 GCCCCTAACTGATACAGAGGAA 5401AAGGAGGCAGACAAGCAAAA 561 Y 1746_561_Y_F_1747_561_Y_R 5369GGAGCTCCTAACCACTGCAC 5402 CTTCCCAGTTGTTCCTCCCT 561 Z1748_561_Z_F_1749_561_Z_R 5370 AGAGGAAGCAACGGATACCA 5403TCACACCGACCTCACAAAGA 561 R 1957_561_R_F_1958_561_R_R 5371ACCTGCCACGATAGCACAG 5404 ATAGGTGAGGAGAACGTGGC 214 B192_214_B_F_193_214_B_R 5372 CACTGTGTTAAAACGCCTGG 5405GTTGGGATTACAGGCACGAG 214 B 194_214_B_F_195_214_B_R 5373CAGAAGCAACCCACATGACC 5406 ACTACAGGTTTGCACCACCA 214 A196_214_A_F_197_214_A_R 5374 GCCCTTAGGGAGAGCAGC 5407CCACATCGTGCCTTTGTGTA 214 C 626_214_C_F_627_214_C_R 5375ATGCTCTCCTGATGGCTCCT 5408 AGGGAATGCAGGTGCAAAG 214 
C628_214_C_F_629_214_C_R 5376 ACTCGGGAAAGGAAGGCTCT 5409CATACCTTGAGTGCACACCG 214 AA 1607_214_AA_F_1608_214_AA_R 5377AGACAGTGTTGTTCCCGGAG 5410 TCACTGCTCACCCACGTTAG 214 E1609_214_E_F_1610_214_E_R 5378 ATATGTTTGCTGGCTTTGGG 5411GAAGGAGTGAGCCGGTAACA 214 E 1611_214_E_F_1612_214_E_R 5379CTGCTTCAAGATGCCAGTGA 5412 AACAAACGCCTGGGTTGAG 214 E1613_214_E_F_1614_214_E_R 5380 CCGTCCCAGGATACCTTTTC 5413CCCAGGCTGTGTGTCCTCTA 214 E 1615_214_E_F_1616_214_E_R 5381ACACCCATCACCTTACATGG 5414 AATGAACGTGGTGACTACAGC 214 E1617_214_E_F_1618_214_E_R 5382 TATCTGGACGTGGTGGTGC 5415AGCAGAGTGAACAGTGGCTG 214 AA 1599_214_AA_F_1600_214_AA_R 5383CGGGCGTGTATATCTCTTCA 5416 TTCGCTTGTGATCATGTCG 214 AA1601_214_AA_F_1602_214_AA_R 5384 TGTACGAACAGTCCAGACGAG 5417GCCATGGTTGTTAAATTAGGC 214 AA 1603_214_AA_F_1604_214_AA_R 5385CGACATGATCACAAGCGAAA 5418 TTTGGTCTGCTTCAGTGGTG 214 AA1605_214_AA_F_1606_214_AA_R 5386 CGAATAAAGGCGTCGAGAAG 5419CAGGGTCCTCTTCAGAGTCG 224 W 133_224_W_F_134_224_W_R 5387CACCTGTCACCTGCCTTGTA 5420 GGGACCCACCTTGCTGAG 224 BB1432_224_BB_F_1433_224_BB_R 5388 CCCAGCCCCTTCTCACTG 5421GGAAAAGGGACCTGGGAAGT 224 C 1434_224_C_F_1435_224_C_R 5389CAGCAAGTCCCTCCTGATGT 5422 TTTAGCTTCCCTCCCCTCAG 224 D1436_224_D_F_1437_224_D_R 5390 GCAGATCCCAGGAAGAACAA 5423AGCTGCCACCCTCTCATCTA 224 J 1438_224_J_F_1439_224_J_R 5391TGTGGGGTACAGTGGCATTA 5424 GCAAACCCACTCACCCTCT 224 L1440_224_L_F_1441_224_L_R 5392 ATCCAGAGATACCCCAGCCT 5425CAAAGGTGGTTTCTGGCAGT 224 Y 1442_224_Y_F_1443_224_Y_R 5393GCCTGTGGGTATTTTGCACT 5426 ACCTACCCCAACTTGTGACG 224 Z1444_224_Z_F_1445_224_Z_R 5394 TTGATTGGATTTGAGCTCTGC 5427CCGTGGAGAGACACCTTCAC 224 S 131_224_S_F_132_224_S_R 5395TTGGCAGACAGAAGAGGAGG 5428 TTTCCTGTAGGTCCATGAG 422 C1859_422_C_F_1860_422_C_R 5429 TTATCTGGGCAGGGTTGTGT 5462CCCATTCCAGAGGAGTGAGA 422 D 1861_422_D_F_1862_422_D_R 5430CTGGCAGACCGATTTGAACT 5463 GGCAGGCACTCCAATTTTC 422 E1863_422_E_F_1864_422_E_R 5431 GTGAGGGCTGACCTATTGCT 5464CGGCCTACTGAGAACCAACT 422 F 1865_422_F_F_1866_422_F_R 5432TTCTTCTTGCCCCAGATTGT 5465 TGAGATGAGGCAGATAGAGGTG 422 
F1867_422_F_F_1868_422_F_R 5433 AAGGCACACAAGAACCTGGA 5466AGGTGGCATCACTGCACTC 436 A 1549_436_A_F_1550_436_A_R 5434CCTAGAGGGTCATCGTTCCC 5467 TCGTACTCGAACAGGAAGGC 436 A1551_436_A_F_1552_436_A_R 5435 ACCCAGACCGACTAGGGGAC 5468GACCGAGGCCAGGATGAG 436 B 1553_436_B_F_1554_436_B_R 5436TTCCCCATCAATTCAAATCC 5469 TCAGGCCACGTCAATCATTA 436 C1555_436_C_F_1556_436_C_R 5437 TTTCTTGGCTCTCCGTGAGT 5470GAGCGAAAAGAAAGTCCACG 436 D 1557_436_D_F_1558_436_D_R 5438GCCACGTGGACTTTCTTTTC 5471 GGGTCATGTGAAGGAATTGG 436 E1559_436_E_F_1560_436_E_R 5439 TAGGAGACCCCTGTGGACAT 5472TGAGGCACAGAAAATCACTTG 436 F 1561_436_F_F 1562_436_F_R 5440CTGCACTCGAGGTGACAGAG 5473 ACACCTGGCCACCACTTACT 436 G1563_436_G_F_1564_436_G_R 5441 TCTCTGAGGTTTTCGTCGCT 5474GGGATGAGCAGCAGAGACAC 436 H 1565_436_H_F_1566_436_H_R 5442CAGGTGCTGAGGAAAGCCT 5475 TGCCTGAGTGCTGGTCTTC 436 I1567_436_I_F_1568_436_I_R 5443 TGTGCCAGCTCCACTCTAAC 5476ATGTCAAATTTCCCTGCCTG 436 J 1569_436_J_F_1570_436_J_R 5444GCCCCTGCAGAAACACTTT 5477 GGTCTTGGAGAAGGGAAGGT 436 K1571_436_K_F_1572_436_K_R 5445 CCATTCCGGTAAAGATTCCA 5478ACACCCAAGAGATGAGAGGC 436 L 1573_436_L_F_1574_436_L_R 5446CTACTTCAGTGCACCTTGCG 5479 ATTTCTCTGGGGTGATGTGG 436 M1671_436_M_F_1672_436_M_R 5447 CCATCAGTGTGCTGAGTGCT 5480ACAGGCCTCTTAAATTGCCA 449 A 1971_449_A_F_1972_449_A_R 5448CCAGATATTCCAGCCTCAGC 5481 ATCAGTGCCATCTCTGTCCC 449 A1973_449_A_F_1974_449_A_R 5449 CTGGGTAGGAGCCTGGCTAT 5482AAATGCTCCTGCCTCAGAAA 449 A 1975_449_A_F_1976_449_A_R 5450GGAAGAGGTGCTAGACGCTG 5483 GCTAGGTGGGATGGGGTATT 449 B1977_449_B_F_1978_449_B_R 5451 AGTGGGCCTCAGGGTGAC 5484TCTCTGCTCCATCCTCAGGT 449 B 1979_449_B_F_1980_449_B_R 5452ATGTGGCAAAGCCAGGAC 5485 CCCCAAGCATAGGACACAGA 449 C1981_449_C_F_1982_449_C_R 5453 TCAATCCCCAATCTCTTCCT 5486CTCTTCCCTCTCCTTGCC 449 D 1983_449_D_F_1984_449_D_R 5454CAACGCCATCCTTACACAGA 5487 TGTGGAGTGTGTAGTACTTGGTCC 449 D1985_449_D_F_1986_449_D_R 5455 ACTGTGATGGACCTGCTCCT 5488TGTGTTGGTGTGGGAGGTC 449 E 1987_449_E_F_1988_449_E_R 5456CAAACCATTATGAGCCTGGG 5489 GTCGTTCTGACCTTCAAGCC 449 F1989_449_F_F_1990_449_F_R 
5457 TGTGGACTTAACACCTCTCCTTC 5490TGAGTGTGGGAGAAGATCCC 449 F 1991_449_F_F_1992_449_F_R 5458GCTCCTTAGCCAAATATGGGA 5491 ATAGATCCCCAGACCCAACC 449 F1993_449_F_F_1994_449_F_R 5459 ATTCCAAGGCCAAGTCCTG 5492TCTGGCCTGGGATAACTCAT 449 F 2011_449_F_F_1992_449_F_R 5460CAGGTGCTCCTTAGCCAAATA 5493 ATAGATCCCCAGACCCAACC 515 A1226_515_A_F_1227_515_A_R 5461 GCTCCATCGGACTCACTAGC 5494TGGATTTCCAGGACTTGAGG 515 A 1228_515_A_F_1229_515_A_R 5495TGTTGGGGCTGGAGTTTATC 5528 TCATGGCAAACATGAAGAGC 515 A1230_515_A_F_1231_515_A_R 5496 GCCGTTCGTGATGGACTACT 5529GCCATTCTGGATCAGCAACT 515 A 1232_515_A_F_1233_515_A_R 5497CAGCCATCATCTCTTGCCTT 5530 CCACCATGATGAAGGTGATG 515 A1234_515_A_F_1235_515_A_R 5498 GCATCATCCTGTTCTGCTCA 5531TGATAAAGAACGCCAGGTCC 515 A 1236_515_A_F_1237_515_A_R 5499GGCCATCGTCTTTGTCATCT 5532 GCTCGTGCTGCGGTTATTAT 515 A1238_515_A_F_1239_515_A_R 5500 ACTTCTCCAGCCCATCCTTT 5533GCAACAGCCCAACTGTTTCT 515 A 1240_515_A_F_1241_515_A_R 5501CATGGAGCCCCTCTTATCTG 5534 GCAACCAGTCTCCCACTCAT 570 C1310_570_C_F_1311_570_C_R 5502 GGTTTTCATCCTTGAAGACTGT 5535CCACAGAGGAAGACCACAA 570 C 1312_570_C_F_1313_570_C_R 5503TAGGCGGCATTGCCTATATT 5536 ACCTTTCAAACAGCCCAAGA 570 D1314_570_D_F_1315_570_D_R 5504 TGAGCTGGTTTCTTACCTCCA 5537CAAAGCCAAGAAAACAGGGA 570 D 1316_570_D_F_1317_570_D_R 5505AGGCATTGGAGTCTTTCAGC 5538 AAATGGCCAAAACAAGTGCT 570 E1318_570_E_F_1319_570_E_R 5506 GAGAGCACAGTTGGTCCACA 5539ACAATGCTTTTGTGTCGGTG 570 F 1320_570_F_F_1321_570_F_R 5507CCTGTATTGCGGGGAGTAAA 5540 TCTGAATCCACAACTGCTGC 570 G1322_570_G_F_1323_570_G_R 5508 CGAAGTCTCGTAGCCAACATC 5541GTGCCTGGACTCAGACACCT 570 H 1324_570_H_F_1325_570_H_R 5509CCATGTGTTAAAGTGCCCCT 5542 CCCCTCACTGGCTATTTTCA 570 I1326_570_I_F_1327_570_I_R 5510 GCTTGCATCACTGTGTTTCC 5543AGAAAGGGAAGCTTGGGGTA 570 I 1516_570_I_F_1517_570_I_R 5511GGGACGTCCTTGACAGACA 5544 TGGAGCTGTTTTTGTGCATC 570 J1330_570_J_F_1331_570_J_R 5512 AAAATACCTGTAGCAGCGCA 5545ATTGGCTCTTGATCGCTGA 570 J 1332_570_J_F_1333_570_J_R 5513GCTACCCTCCTGCTTTTCCT 5546 ATCAATCCAGGCAACATGC 570 B1897_570_B_F_1898_570_B_R 5514 
TGGTGCTATTCCTGAACGGG 5547GCCGTGCAGTTGAGCAGG 581 C 1362_581_C_F_1363_581_C_R 5515TTCCGTGACTCTGGGATCTT 5548 ATGAACCTCAACACCCAAGG 581 D1364_581_D_F_1365_581_D_R 5516 GGAAAACCTTGCTTGTGGAA 5549TGTTGGAACAGACCTGATTTTC 581 E 1366_581_E_F_1367_581_E_R 5517TGAGGGGAGAGATACAGGTGA 5550 TGTTGCCACACAACACAATG 581 E1368_581_E_F_1369_581_E_R 5518 ACAAGAATGTGCCTAACTGGC 5551GACTCCGTCTTGGGGAAAA 581 F 1370_581_F_F_1371_581_F_R 5519ACCATGCCTTGCCAAGAA 5552 GCTCATACTGTGCTGCCAAA 581 F1524_581_F_F_1525_581_F_R 5520 CAGTACTACGACATTTCTGCCAA 5553GGAATAAACAAGCCAAACCG 581 G 1374_581_G_F_1375_581_G_R 5521GATTGTTCGGTTTGGCTTGT 5554 TCAGCATCCCACAGATGAAG 698 A1334_698_A_F_1335_698_A_R 5522 GACCAGAATCCCAAGAGCAC 5555TGCTGTGATTGCCCTAACAA 698 B 1336_698_B_F_1337_698_B_R 5523TTTTGCCCACTGAGATGCTA 5556 AAATCCAGTGGCTTCCTTCC 698 C1338_698_C_F_1339_698_C_R 5524 ACTGCTTTGTCTCCTGGGAA 5557CACAAAACTGAAACCCTGCC 698 E 1342_698_E_F_1343_698_E_R 5525TGTTTGGCTTGATCACTGAGA 5558 TGACTGCCAAGCAATTTTCA 698 F1344_698_F_F_1345_698_F_R 5526 AGGAAGGTGTTTATGCACGG 5559GCTCTTTCACCGAAAACTGC 698 G 1520_698_G_F_1521_698_G_R 5527CAGGTGAGTTTAGTTTCCTGTCC 5560 CCTCCCATCTTGCAGTTCAT 698 G1522_698_G_F_1523_698_G_R 5561 TCAGGTTGTCTGTCTGTTGTCA 5594AAACGGCATCTACCAATTAAATC 698 H 1348_698_H_F_1349_698_H_R 5562CATCCCCGTGAGTTTGATTT 5595 CTCACTGCCACCCACAGTAG 698 I1350_698_I_F_1351_598_I_R 5563 TCCTGCTCCTTCTGTGTAAGG 5596TTTCTGGAAGACCCCAGTTT 698 J 1352_698_J_F_1353_698_J_R 5564TGTGTCGTAGGCATGAATTG 5597 CCCTCATCCTTTCATCTTGTG 698 K1354_698_K_F_1355_698_K_R 5565 GGAGCATGTGAACACCTGAA 5598GAAACCACCACCAAGGAGAA 698 L 1356_698_L_F_1357_698_L_R 5566AGTTTTCAGCACATCCGTGT 5599 GCCTTTTAAACCACAGCTATTTC 698 M1358_698_M_F_1359_698_M_R 5567 TTGACCTACAAGCTGTGCCA 5600CTCTGGCCAACAAGAAAAGC 698 M 1360_698_M_F_1361_698_M_R 5568TCCTTCCACTAAAGGGTGTCA 5601 TCCTAATCCCCTTCCCAAGT 698 D1518_698_D_F_1519_698_D_R 5569 TGTGTCTTCTTGCTGTGTCTCT 5602ACCATTGTTATTCCGGGCT 702 A 630_702_A_F_631_702_A_R 5570GGCCAGGGACATCAGGTT 5603 GTCTGCAGCTGCCCTGTT 702 A 632_702_A_F_633_702_A_R5571 
CCCCTCACCCTGCTCTCT 5604 CATAAGACGGGACTGTGCCT 702 B634_702_B_F_635_702_B_R 5572 AGTGAGCTGGGCTAGGCTCT 5605GGAGACCCCGTTCCTCAC 702 C 636_702_C_F_637_702_C_R 5573CTGCTCCTCATCCTCACAGG 5606 CCCTGAACTTCCACGAGGT 702 C638_702_C_F_639_702_C_R 5574 GTCGAAGGGGTAGCCGTC 5607CCTGTTCTCCGTGACTCACTC 702 D 640_702_D_F_641_702_D_R 5575GGGGTTTCTGACCCCTCTT 5608 CAGTGGCTGTCCACGAGTT 702 D642_702_D_F_643_702_D_R 5576 ACCTTGTCCTCGTAGGGGAG 5609GCCCTTCTTGCCCTTAGTTC 702 E 644_702_E_F_645_702_E_R 5577CAGAGCCTGTCTGCTGAGTG 5610 GGACAGGGATGAGGACAGAC 702 F646_702_F_F_647_702_F_R 5578 CACACAAGGATGCCTGTCC 5611 GGTCTGCACCCAGAGTGG702 G 648_702_G_F_649_702_G_R 5579 TGGGTGCAGACCGTCTCT 5612CTCCATGAGGCGGACAGA 702 H 650_702_H_F_651_702_H_R 5580CTTGGCTGCCCTGTAGTGAT 5613 CATCGACGCTGCCTTCTC 702 H652_702_H_F_653_702_H_R 5581 CCTCGTGTGGTCATCGTAAC 5614GGCTGACACAGGAGAAGGAA 702 I 654_702_I_F_655_702_I_R 5582CGAGGGTACCCACTCCCAT 5615 ACCAACCCCACCCACACT 702 I656_702_I_F_657_702_I_R 5583 AGCAGGGAGAGGTCATGTTG 5616CAGAAGGGTGCCCAGTCA 702 I 658_702_I_F_659_702_I_R 5584 CCGAGATGCTCCCTCCAG5617 CACAGAGGGCAAGGACTGTG 702 I 660_702_I_F_661_702_I_R 5585TCGTCAGTCAACACAGTCCC 5618 CCAGGCCCTGACGCTATG 702 I662_702_I_F_663_702_I_R 5586 CACAGTCCTTGCCCTCTGTG 5619GCCCCTCCAGGACAACAT 702 I 664_702_I_F_665_702_I_R 5587GTGCATGAGCAGACCTCGTA 5620 TGCCTCCTACTTCTTCCGTG 702 I666_702_I_F_667_702_I_R 5588 CTCCACACACCAGCCAGTC 5621 CAGTCTTGTGCAAGCCCC722 B 382_722_B_F_510_722_B_R 5589 TTCAGTTCGCTATTTGTGCC 5622GGACAGGTAGGCAGGCTATG 722 C 813_722_C_F_814_722_C_R 5590GATTTGAGTTTGCCATGCTGT 5623 ACAGCCAGAGGGACACACA 722 D386_722_D_F_387_722_D_R 5591 ATGTTGGATATTATAGCTCAGATGC 5624CAAATACCCATACTCCCAACATC 722 E 388_722_E_F_389_722_E_R 5592TTGAAGTCAGGCTTGGAACA 5625 TTCAGAGTCTGCAAGAAGAAAGT 722 F390_722_F_F_391_722_F_R 5593 ATGGCCCTCAGATACGAATG 5626TTGAAGTGAGACCTTAAGGGAGA 722 G 512_722_G_F_513_722_G_R 5627ATGGTTGCAAATGGCTTTGT 5652 ACAGAAGAGGACATGGAGCC 722 H394_722_H_F_395_722_H_R 5628 CCCTTTAACTTCCAAACCCA 5653TCTTGGAGAATGCAAGAGTCTG 722 I 396_722_I_F_397_722_I_R 
5629CCATTACATGCACATCGTGTT 5654 TCTTCGAAGCCAAACTCACC 722 J1526_722_J_F_1527_722_J_R 5630 GCAAATGCCATTGTTGATTT 5655CGGGTTACAGCGTCTGAGAT 722 AA 739_722_AA_F_740_722_AA_R 5631TCAGCTTGCTTTTCTTTGACA 5656 GTGGCTGGCAAGCTTTTATT 722 A1901_722_A_F_1902_722_A_R 5632 GGGCTCCCGCTGGAAAG 5657 GGCCTGAACCGCTACCC748 A 1995_748_A_F_1996_748_A_R 5633 TAGCATCCACCTGTGGTCCC 5658CAGAAGCCAGAAGGGCAAAG 748 A 1997_748_A_F_1998_748_A_R 5634GCTTCCATGGTTGCTTAAAA 5659 TGCCTTTCAATCAGTAGAAGAAC 748 A1999_748_A_F_2000_748_A_R 5635 TAAGAATGGGTTCGAGGGTG 5660TGGTTGAGAGAGCAAGAGGAA 751 U 1945_751_U_F_1946_751_U_R 5636GGTGCTACCTCCTCTGATCCT 5661 CACCTGCAGCCTCATGGTA 751 V1947_751_V_F_1948_751_V_R 5637 TAGCCTGTGGTGAGGGCAGT 5662TCCTGTGACCTCAAAGCATCC 751 W 1949_751_W_F_1950_751_W_R 5638TGCCACTCAGGGTGACTGT 5663 TGCAAGCCTGCTCCTGAT 751 X1951_751_X_F_1952_751_X_R 5639 CCTAACTACGTGCAAAGGGC 5664GCTCAGGATTTGAGTCCCAG 751 Y 1953_751_Y_F_1954_751_Y_R 5640ATTTCCAAATCCCAACCTCC 5665 CTGGGACCCTCGGTTTATG 751 Z1955_751_Z_F_1956_751_Z_R 5641 TCACTGGGCTTATGGCTCTC 5666GTCCATGAGCAAAGGTGGAG 848 Y 2001_848_Y_F_2002_848_Y_R 5642GCCTCCAACTTTGCCTCTC 5667 TAAAACGCAAATCCCACCTC 848 Y2003_848_Y_F_2002_848_Y_R 5643 TCTCCTCGCCCTCTCTCTG 5668TAAAACGCAAATCCCACCTC 848 Z 2004_848_Z_F_2005_848_Z_R 5644CATTTGTCTTCACTGGCCG 5669 TGGTGTCTGCCGCTGATT GenR2 A1453_GenR2_A_F_1454_GenR2_A_R 5645 CCAAGCCCCAAATTTAAGTG 5670CCTCTCGCCTAAAACTGTGC GenR2 B 1455_GenR2_B_F_1456_GenR2_B_R 5646CATTTCTTGGCACACAATGG 5671 TGGTTGAGCCACCATACTCA GenR2 C1457_GenR2_C_F_1458_GenR2_C_R 5647 TATTTCACCCAGGAGGTTCG 5672TGTTGCCAAGAATGTGGAAA GenR2 D 1459_GenR2_D_F_1460_GenR2_D_R 5648TCCTCCTAGGAACAGAGCCA 5673 ATGCACTCAGCGACCTTCTC GenR2 F1575_GenR2_F_F_1576_GenR2_F_R 5649 GTCTTTCCCATCCCTCAACA 5674GGGAGGCATAATGAACCAGA GenR2 F 1577_GenR2_F_F_1578_GenR2_F_R 5650TAGCGCCCTATCCCTTTCTT 5675 TCCATCCCAAGCTTCACTCT GenR2 E1790_GenR2_E_F_1791_GenR2_E_R 5651 CTCTGACCTTGCACTACCCC 5676CCACCGTGTCTTCAAATTCA
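The SSCP assay names in Table 8 appear to follow a regular pattern: a forward primer number, gene, exon, and `F`, followed by a reverse primer number, gene, exon, and `R` (for example, `55_454_A_F_56_454_A_R`). Assuming that pattern, which is inferred from the table rather than stated in the text, a short Python sketch can split each name into its parts and flag names whose forward and reverse halves disagree. The class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SscpAssay(NamedTuple):
    forward_id: int
    reverse_id: int
    gene: str
    exon: str
    consistent: bool  # True when both halves name the same gene and exon


# Assumed pattern: <id>_<gene>_<exon>_F_<id>_<gene>_<exon>_R
_ASSAY = re.compile(
    r"(\d+)_([A-Za-z0-9]+)_([A-Z]+)_F_(\d+)_([A-Za-z0-9]+)_([A-Z]+)_R"
)


def parse_assay(name: str) -> Optional[SscpAssay]:
    """Parse an assay name; return None when it does not fit the pattern."""
    match = _ASSAY.fullmatch(name.strip())
    if match is None:
        return None
    f_id, f_gene, f_exon, r_id, r_gene, r_exon = match.groups()
    consistent = f_gene == r_gene and f_exon == r_exon
    return SscpAssay(int(f_id), int(r_id), f_gene, f_exon, consistent)


for name in ("55_454_A_F_56_454_A_R", "1350_698_I_F_1351_598_I_R"):
    print(name, "->", parse_assay(name))
```

A check of this kind surfaces entries such as `1350_698_I_F_1351_598_I_R`, where the two halves name different genes, so that they can be compared against the gene column before the primers are ordered or reused.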

Comparative DNA sequencing was used to determine the sequence changes in the genes in 12q23-qter. Variants detected by SSCP analysis in the initial set of asthmatic and normal individuals were analyzed by fluorescent sequencing on an ABI 377 automated sequencer (Perkin-Elmer Applied Biosystems Division). Sequencing was performed using Amersham Energy Transfer Dye Primer chemistry (Amersham-Pharmacia Biotech) following the standard protocol described by the manufacturer. Primers used for dye primer sequencing are shown in Table 9. Column 1 lists the genes targeted for sequencing. Column 2 lists the specific exons sequenced. Columns 3 and 4 list the forward primer names and the forward primer sequences, respectively. Columns 5 and 6 list the reverse primer names and reverse primer sequences, respectively.

TABLE 9 SEQUENCING PRIMERS SEQ ID Gene Exon Forward PrimerForward Sequence NO: 454 B MDSeq_118_454_B_F CCAGATACTGGGCAAAGGAG 5677454 E MDSeq_119_454_E_F AGCCAGCAGAATCCACAGTC 5678 454 EMDSeq_473_454_E_F TCCTGTTACTCTCCTGCGGT 5679 454 F MDSeq_120_454_F_FACAGCAAGGAGGAAGTCCG 5680 454 G MDSeq_121_454_G_F TTCTCCCAGAGCAAGTGACC5681 454 H MDSeq_122_454_H_F AGTGCCCTGAATTCCAGTCT 5682 454 HMDSeq_291_454_H_F AGTGCCCTGAATTCCAGTCT 5683 454 K MDSeq_123_454_K_FCCCAGAACCCAGCACTTTC 5684 454 L MDSeq_124_454_L_F GTCTCCCCTTAATGTGTGGG5685 454 M MDSeq_125_454_M_F CCAGCACTTGAACGCATCTA 5686 454 NMDSeq_126_454_N_F AGCATGGGGTTCCCATTT 5687 454 O MDSeq_127_454_O_FCGATTCCTGGACAACCAGA 5688 454 O MDSeq_128_454_O_F GAACACATGCATGGTCCTGA5689 454 AA MDSeq_460_454_AA_F CTCAACTCCACTTTCCTCGG 5690 454 AAMDSeq_470_454_AA_F TGCATCTTTGAGTGACTGCTG 5691 454 AA MDSeq_471_454_AA_FTCTTGTGACATTTGCAAGGC 5692 757 A MDSeq_407_757_A_F CTCGCTTCCCGGTATTGTT5693 757 A MDSeq_408_757_A_F TTCTTCCTGTGCTCGCTGTA 5694 757 AMDSeq_409_757_A_F CGTGGACGTGTACTGGAGC 5695 757 A MDSeq_410_757_A_FAGCCAACAGCAGCTACTTCC 5696 757 A MDSeq_411_757_A_F TCTTTATGCTGCTGGTGGTG5697 757 A MDSeq_412_757_A_F AGGGAAGCTCCTCCAGTGA 5698 757 AMDSeq_413_757_A_F TGAACTCAAACGATGTGCAA 5699 757 A MDSeq_418_757_A_FCTCGCTTCCCGGTATTGTT 5700 757 A MDSeq_419_757_A_F AGGGAAGCTCCTCCAGTGA5701 757 A MDSeq_421_757_A_F CAAACTTTGCTGCTCTCCG 5702 757 AMDSeq_422_757_A_F CAAGAAGAGGCCGAAGTTTG 5729 757 A MDSeq_423_757_A_FGAGGACACGTCCAACGCC 5730 757 A MDSeq_424_757_A_F CAAGAAGAGGCCGAAGTTTG5731 757 A MDSeq_425_757_A_F GAGGACACGTCCAACGCC 5732 561 BMDSeq_169_561_B_F ACTGCTCTCCCGTGAAAGTG 5733 561 C MDSeq_170_561_C_FTTAAGCCAAGGAAAGGAGCA 5734 561 E MDSeq_171_561_E_F ATCTGTGTGTGTGAGCTGGC5735 561 H MDSeq_172_561_H_F AAATGGTTGACGTCACTGGC 5736 561 JMDSeq_173_561_J_F TGTTGGAGCTGAGAGACCTG 5737 561 H MDSeq_174_561_H_FCTCTGGGCAGAGGACTGGT 5738 561 M MDSeq_177_561_M_F ACCCTGCCTGATGAGAAGAA5739 561 P MDSeq_183_561_P_F AGGCAGATTCCTCAGCTCCT 5740 561 GMDSeq_390_561_G_F GCATTTCCCAGAAGATGGTG 5741 561 H 
MDSeq_392_561_H_FCTCTGGGCAGAGGACTGGT 5742 561 X MDSeq_401_561_X_F GAACTGCCCTGTCCATCTGT5743 561 Y MDSeq_402_561_Y_F ACAACTCCAATTGGCGAGAA 5744 561 XMDSeq_415_561_X_F GAACTGCCCTGTCCATCTGT 5745 561 X MDSeq_417_561_X_FGAACTGCCCTGTCCATCTGT 5746 214 B MDSeq_15_214_B_F GACAGTCTGCTCCACATCCA5747 214 C MDSeq_110_214_C_F ATATGTTTGCTGGCTTTGGG 5748 214 EMDSeq_343_214_E_F TGCTTCCTGTTTGTCACTGC 5749 214 E MDSeq_383_214_E_FATGGACCTGGGTGAGGACTT 5750 214 AA MDSeq_399_214_AA_F CGAATAAAGGCGTCGAGAAG5751 224 BB MDSeq_403_224_BB_F AATTGACTTTCCCGCCTTCT 5752 422 EMDSeq_431_422_E_F AAGCATCTTGGCGAAGTCAT 5753 422 F MDSeq_434_422_F_FTGGGCATCCTGATGTACTTG 5754 422 C MDSeq_323_436_C_F TGTGAAAAGTGTTGCTCTGAA5755 422 D MDSeq_324_436_D_F TGTGAAAAGTGTTGCTCTGAA 5756 422 EMDSeq_325_436_E_F TCTTTAGCTTGGCATCACCC 5757 422 G MDSeq_326_436_G_FCTGCACTCGAGGTGACAGAG 5758 422 K MDSeq_327_436_K_F GCTAGGCATGGTGAGTGGTT5759 422 B MDSeq_340_436_B_F CCATCAGTGTGCTGAGTGCT 5760 422 LMDSeq_374_436_L_F GCACAGGCCTCTCATCTCTT 5793 422 A MDSeq_375_436_A_FCAAGATTCCTCTCACCTCGG 5794 422 C MDSeq_393_436_C_F TCACTGTTTTCCATTGGGTTA5795 422 D MDSeq_394_436_D_F TCACTGTTTTCCATTGGGTTA 5796 422 GMDSeq_395_436_G_F GGCTGCAGAAAACTTCACTCT 5797 422 A MDSeq_396_436_A_FGCTGGGATGACAGGTGTGAG 5798 422 A MDSeq_404_436_A_F AGGAGCCTTTCGTCCTCAA5799 422 D MDSeq_414_436_D_F TCACTGTTTTCCATTGGGTTA 5800 422 DMDSeq_416_436_D_F TCACTGTTTTCCATTGGGTTA 5801 436 C MDSeq_323_436_C_FTGTGAAAAGTGTTGCTCTGAA 5802 436 D MDSeq_324_436_D_F TGTGAAAAGTGTTGCTCTGAA5803 436 E MDSeq_325_436_E_F TCTTTAGCTTGGCATCACCC 5804 436 KMDSeq_327_436_K_F GCTAGGCATGGTGAGTGGTT 5805 436 B MDSeq_340_436_B_FCCATCAGTGTGCTGAGTGCT 5806 436 L MDSeq_374_436_L_F GCACAGGCCTCTCATCTCTT5807 436 C MDSeq_393_436_C_F TCACTGTTTTCCATTGGGTTA 5808 436 DMDSeq_394_436_D_F TCACTGTTTTCCATTGGGTTA 5809 436 G MDSeq_395_436_G_FGGCTGCAGAAAACTTCACTCT 5810 436 A MDSeq_404_436_A_F AGGAGCCTTTCGTCCTCAA5811 436 D MDSeq_414_436_D_F TCACTGTTTTCCATTGGGTTA 5812 436 DMDSeq_416_436_D_F TCACTGTTTTCCATTGGGTTA 5813 449 D 
MDSeq_462_449_D_FGTCACACAGCCAGTAGGCAG 5814 449 F MDSeq_463_449_F_F AAGAGAAAATCCGGAGGACC5815 449 A MDSeq_472_449_A_F CCAACTTCAGTTTCCCAACG 5816 449 FMDSeq_474_449_F_F CACATATCTGCCCTGCTCCT 5817 515 A MDSeq_235_515_A_FCAGCCATCATCTCTTGCCTT 5818 515 A MDSeq_236_515_A_F TGGACCTGGCGTTCTTTATC5819 515 A MDSeq_237_515_A_F CGTAGTTTCCTGGTAACCATTCA 5820 515 AMDSeq_239_515_A_F GGCCATCGTCTTTGTCATCT 5821 515 A MDSeq_263_515_A_FCTGCTGTGTGTTCCGAGATG 5822 515 A MDSeq_265_515_A_F GGCCATCGTCTTTGTCATCT5823 570 C MDSeq_266_570_C_F TTGATTGTGTTGCGCTTCTT 5824 570 FMDSeq_268_570_F_F CACCTGATTATTTTCCCCTCA 5857 570 I MDSeq_270_570_I_FCTGAGTGAGCGGAGGTGTTT 5858 570 J MDSeq_271_570_J_F CAGACAGCCCACCTCCAG5859 570 I MDSeq_294_570_I_F GCTGGCACTGGTGTCTATCA 5860 581 EMDSeq_277_581_E_F GGGAGATTTGATAGGGTCAGC 5861 581 F MDSeq_345_581_F_FCCTTCTGAGTAGCTGGGCTC 5862 698 B MDSeq_274_698_B_F TGTCCTGGACCATCACAGTT5863 698 E MDSeq_275_698_E_F GTAAGCATTTGTGTGGCAGC 5864 698 HMDSeq_280_698_H_F TGTGTACAGATTGCCCTACCC 5865 698 I MDSeq_287_698_I_FGACAGCGCCTCTGGGTATTA 5866 702 C MDSeq_111_702_C_F GTGATGAGGACAAGCTCGG5867 702 D MDSeq_112_702_D_F CAACCCTGCCTGTCGTAACT 5868 702 AMDSeq_113_702_A_F TTCCCACCACTCTCCTGC 5869 702 B MDSeq_114_702_B_FCCCTCTGATCAGGCACAGTC 5870 702 F MDSeq_115_702_F_F ACGCTTCTTGTAGGACCGAA5871 702 I MDSeq_116_702_I_F AGCAGGGAGAGGTCATGTTG 5872 702 IMDSeq_117_702_I_F CACTAGGGGACAGCTCCGT 5873 702 B MDSeq_178_702_B_FAGGCACAGTCCCGTCTTATG 5874 702 I MDSeq_179_702_I_F TCGTCAGTCAACACAGTCCC5875 702 C MDSeq_191_702_C_F AGATCGGCCTAGTGGGAAAT 5876 702 IMDSeq_196_702_I_F CAGTCTTGTGCAAGCCCC 5877 702 I MDSeq_269_702_I_FAGCAGGGAGAGGTCATGTTG 5878 722 F MDSeq_63_722_F_F TAAGTAGGGTTGTGACCGGC5879 722 C MDSeq_132_722_C_F ACCTGATAGGTTTTCCCGGT 5880 722 AAMDSeq_135_722_AA_F GACACGATCCTGGCTCTCTG 5881 722 B MDSeq_141 722_B_FTTCAGCCAGGATCTGTTGTG 5882 722 B MDSeq_146 722_B_F TGCAACACCAGCAGTTTCAC5883 722 G MDSeq_150_722_G_F CAGTGTGCCGAGACATTGTT 5884 722 AMDSeq_441_722_A_F TATTACCCAAAGCTGCACCC 5885 751 U 
MDSeq_455_751_U_FAGACACTCTCCAGCTCTCGC 5886 751 W MDSeq_456_751_W_F CTCCCAGGTAAATGCCTCAA5887 GenR2 F MDSeq_420_GenR2_F_F CCCAGGAGACAGAGGTTTCA 5888 SEQ ID GeneExon Reverse Primer Reverse Sequence NO: 454 B MDSeq_118_454_B_RGCACCAGGACATGAGGCTAT 5703 454 E MDSeq_119_454_E_R GGTACCCTGGAAGATCTGGG5704 454 E MDSeq_473_454_E_R CCAACTCACGCAAAGAATGA 5705 454 FMDSeq_120_454_F_R TGGAAAAGGGTTCTCCAGC 5706 454 G MDSeq_121_454_G_RCCACAGGAAAGGAATACACCA 5707 454 H MDSeq_122_454_H_R CATTCATCTTGTTGCCTTGG5708 454 H MDSeq_291_454_H_R CATTCATCTTGTTGCCTTGG 5709 454 KMDSeq_123_454_K_R TAGAATTGCTTTCCAGGCCC 5710 454 L MDSeq_124_454_L_RGGGCCTAATTTTCGTGCAT 5711 454 M MDSeq_125_454_M_R CTTCCCTCTATCTTGCCCCT5712 454 N MDSeq_126_454_N_R ATTGGAAGGGGGCATAAAAG 5713 454 OMDSeq_127_454_O_R GGACAGTTTGCTGTGCCTC 5714 454 O MDSeq_128_454_O_RACAGACAGGATTTCGCCTTG 5715 454 AA MDSeq_460_454_AA_R CAAGAAGCGCCAAGTCCTAC5716 454 AA MDSeq_470_454_AA_R ACTCTGGTCTGCAGTTGGTG 5717 454 AAMDSeq_471_454_AA_R TCAGAATGTGCACCTGAAGC 5718 757 A MDSeq_407_757_A_RGCCTCCATGCACAGGTAGTT 5719 757 A MDSeq_408_757_A_R CTCTCCAGTCCCTCCTGGAT5720 757 A MDSeq_409_757_A_R CTCCAGCTTGTCCGTGTTCT 5721 757 AMDSeq_410_757_A_R GACTGGGCAGGGATCTCATA 5722 757 A MDSeq_411_757_A_RGGGTCCTGTCTTTCCTCTGC 5723 757 A MDSeq_412_757_A_R TCTGCCAACCTAGTGCTTCC5724 757 A MDSeq_413_757_A_R TTCCAACTTCACACATTGCC 5725 757 AMDSeq_418_757_A_R GCCTCCATGCACAGGTAGTT 5726 757 A MDSeq_419_757_A_RTCTGCCAACCTAGTGCTTCC 5727 757 A MDSeq_421_757_A_R AGTTGGGGTCGTTCTTGTTG5728 757 A MDSeq_422_757_A_R TACAGCGAGCACAGGAAGAA 5761 757 AMDSeq_423_757_A_R CTCGTCCGAGCCGTTGTT 5762 757 A MDSeq_424_757_A_RTACAGCGAGCACAGGAAGAA 5763 757 A MDSeq_425_757_A_R CTCGTCCGAGCCGTTGTT5764 561 B MDSeq_169_561_B_R CCATCAGCATCTGTGTGACC 5765 561 CMDSeq_170_561_C_R CCTCGATGGGATTTGCTTT 5766 561 E MDSeq_171_561_E_RGGGTGCTGAAAGACAAGAGC 5767 561 H MDSeq_172_561_H_R CTGTGGCTGTGGCAGGAT5768 561 J MDSeq_173_561_J_R CCTCTAAACTCCTTTACCCAGACC 5769 561 HMDSeq_174_561_H_R TGACAGAGTCCACCAGCAAA 5770 561 M 
MDSeq_177_561_M_RTGTTTGCAAGCAAGACGGTA 5771 561 P MDSeq_183_561_P_R CAGAGGGCAAATAACCTCCA5772 561 G MDSeq_390_561_G_R TAATCCAGAGCAGAGCAGGG 5773 561 HMDSeq_392_561_H_R TGACAGAGTCCACCAGCAAA 5774 561 X MDSeq_401_561_X_RAAATCTCAGGCTGGGAGGAC 5775 561 Y MDSeq_402_561_Y_R CCAAGCAGAGATAACCAGCA5776 561 X MDSeq_415_561_X_R AAATCTCAGGCTGGGAGGAC 5777 561 XMDSeq_417_561_X_R AAATCTCAGGCTGGGAGGAC 5778 214 B MDSeq_15_214_B_RTGGAGATGAAGTCTTGCTCT 5779 214 C MDSeq_110_214_C_R CCCAGGCTGTGTGTCCTCTA5780 214 E MDSeq_343_214_E_R TGAGGACACGATGAACCTGA 5781 214 EMDSeq_383_214_E_R GCAGTGACAAACAGGAAGCA 5782 214 AA MDSeq_399_214_AA_RCCTTCCTGGAGAGGACGTG 5783 224 BB MDSeq_403_224_BB_R GCCCAGCCATCCTTCTACTT5784 422 E MDSeq_431_422_E_R AAAGGAGACACTGCCCAGAA 5785 422 FMDSeq_434_422_F_R GTGGTGCATGCCTATGGTC 5786 422 C MDSeq_323_436_C_RAGTTTGGGTGACAGAGCG 5787 422 D MDSeq_324_436_D_R AGTTTGGGTGACAGAGCG 5788422 E MDSeq_325_436_E_R ACGCAGAGTTGAAGGTGCTT 5789 422 GMDSeq_326_436_G_R AGCCAGGAGATACGTTGTGC 5790 422 K MDSeq_327_436_K_RCGCAAGGTGCACTGAAGTAG 5791 422 B MDSeq_340_436_B_R ACCCAAAATGTGGAAAGGTG5792 422 L MDSeq_374_436_L_R AGAGTTGACCCAGCCAAGAA 5825 422 AMDSeq_375_436_A_R AACAGCAGCAAGCAGCCT 5826 422 C MDSeq_393_436_C_RGTAGGGCAAGAGCTGGGATG 5827 422 D MDSeq_394_436_D_R GTAGGGCAAGAGCTGGGATG5828 422 G T MDSeq_395_436_G_R TGAGTGCTGGTCTTCAGTGG 5829 422 AMDSeq_396_436_A_R TCCCAAAGTGCTCGGATTAC 5830 422 A MDSeq_404_436_A_RATGTTGCCCAAATTGGTTTC 5831 422 D MDSeq_414_436_D_R GTAGGGCAAGAGCTGGGATG5832 422 D MDSeq_416_436_D_R GTAGGGCAAGAGCTGGGATG 5833 436 CMDSeq_323_436_C_R AGTTTGGGTGACAGAGCG 5834 436 D MDSeq_324_436_D_RAGTTTGGGTGACAGAGCG 5835 436 E MDSeq_325_436_E_R ACGCAGAGTTGAAGGTGCTT5836 436 K MDSeq_327_436_K_R CGCAAGGTGCACTGAAGTAG 5837 436 BMDSeq_340_436_B_R ACCCAAAATGTGGAAAGGTG 5838 436 L MDSeq_374_436_L_RAGAGTTGACCCAGCCAAGAA 5839 436 C MDSeq_393_436_C_R GTAGGGCAAGAGCTGGGATG5840 436 D MDSeq_394_436_D_R GTAGGGCAAGAGCTGGGATG 5841 436 GT MDSeq_395_436_G_R TGAGTGCTGGTCTTCAGTGG 5842 436 A 
MDSeq_404_436_A_RATGTTGCCCAAATTGGTTTC 5843 436 D MDSeq_414_436_D_R GTAGGGCAAGAGCTGGGATG5844 436 D MDSeq_416_436_D_R GTAGGGCAAGAGCTGGGATG 5845 449 DMDSeq_462_449_D_R CAGAGAGCAAGAAGGCCAAG 5846 449 F MDSeq_463_449_F_RACGGGGTCTCCCTGTGATA 5847 449 A MDSeq_472_449_A_R CAGGGACGTGGACTCTGATA5848 449 F MDSeq_474_449_F_R CACCATCAGGATTCTTCACG 5849 515 AMDSeq_235_515_A_R ATTACTCGATGCAACAGCCC 5850 515 A MDSeq_236_515_A_RCAGGAGCAACACAATTCCCT 5851 515 A MDSeq_237_515_A_R TTGGAGATCTTGTTCAGGGC5852 515 A MDSeq_239_515_A_R GCGTCAGAGATGAAGCAAGT 5853 515 AMDSeq_263_515_A_R GTGTGCAGGAGCCAGAAGAT 5854 515 A MDSeq_265_515_A_RGCGTCAGAGATGAAGCAAGT 5855 570 C MDSeq_266_570_C_R GCATGAGCTCTGGAATCAGG5856 570 F MDSeq_268_570_F_R AACCTCCCTTTAACTCAGTC 5889 570 IMDSeq_270_570_I_R TTGGCAATTTCTTTCATCAG 5890 570 J MDSeq_271_570_J_RCCAAGACTTTGCAATCTCCA 5891 570 I MDSeq_294_570_I_R CCACGTAGGAATGGAGCTGT5892 581 E MDSeq_277_581_E_R TAGCCAGGCGTGGTGGTA 5893 581 FMDSeq_345_581_F_R TAGACTTCTGACGCTGGGCT 5894 698 B MDSeq_274_698_B_RCGGCTAAGTCTTTCATCACG 5895 698 E MDSeq_275_698_E_R TGCCAAGGGCTGTTTCTAAT5896 698 H MDSeq_280_698_H_R TGACGAATACAGGATGAAAGTC 5897 698 IMDSeq_287_698_I_R TGAAACAGGCCAGAGAAGTTT 5898 702 C MDSeq_111_702_C_RACGTTCCCACGGGACTCA 5899 702 D MDSeq_112_702_D_R CGCTCCATGAATGGTACAAA5900 702 A MDSeq_113_702_A_R AAGGGTGGGAGCCCTGAC 5901 702 BMDSeq_114_702_B_R GGATATCTACAGCAGGCCCA 5902 702 F MDSeq_115_702_F_RAAGACGATCTTGTGGTCGCT 5903 702 I MDSeq_116_702_I_R GGTGTGTGGAGACTCACAGG5904 702 I MDSeq_117_702_I_R CTGCCATCTAGCACGAGCC 5905 702 BMDSeq_178_702_B_R GAGAGCTCCTGCTGCTGTCT 5906 702 I MDSeq_179_702_I_RCCCACTGCAGTCTTGTGC 5907 702 C MDSeq_191_702_C_R GCTCTCATTTCCCTCCCTC 5908702 I MDSeq_196_702_I_R CACAGTCCTTGCCCTCTGTG 5909 702 IMDSeq_269_702_I_R GGTGTGTGGAGACTCACAGG 5910 722 F MDSeq_63_722_F_RCACTCTCCCAATCTCCCTGA 5911 722 C MDSeq_132_722_C_R ATACAGATGCCCTGGCTCG5912 722 AA MDSeq_135_722_AA_R GCCTGGGTGACACAGCTA 5913 722 BMDSeq_141_722_B_R GGGCCTGGGAGTTACCTTAT 5914 722 B MDSeq_146_722_B_RACCTCTACGGCAGGCTGAAT 
5915 722 G MDSeq_150_722_G_R TGAGTCTCCACAAACATAGC5916 722 A MDSeq_441_722_A_R TCAGGACTCCCTGAGACCC 5917 751 UMDSeq_455_751_U_R GCAGGACCCTGGACTACAGA 5918 751 W MDSeq_455_751_W_RTACTGTCCTCCATTCCCAGC 5919 GenR2 F MDSeq_420_GenR2_F_RCCCAGACTGGCTTTGAACTC 5920

Single nucleotide polymorphisms (SNPs) that were identified in genes from the disorder region are shown in Table 10. Column 1 lists the gene names. Column 2 lists the exons that either contain the SNPs or are flanked by intronic sequences that contain the SNPs. Column 3 lists the PMP sites for the SNPs. Column 4 lists the localization of the SNPs to exon, intron, or UTR sequences. Column 5 lists the SNP reference sequences and illustrates the SNP nucleotide changes with underlining. Column 6 lists the SEQ ID NOs of the SNP reference sequences. Column 7 lists the base changes of the SNP sequences. Column 8 lists the amino acid changes resulting from the SNP sequences.

The “−” symbols denote polymorphisms which are 5′ of the exon and are within the intronic region. The “−” polymorphisms are numbered going from the 3′ to 5′ direction. The “+” symbols denote polymorphisms which are 3′ of the exon and are within the intronic region. The “+” polymorphisms are numbered going from the 5′ to 3′ direction. The first, second, and third columns, combined, correspond to the SNP names as described herein, e.g., 214_B_1, 214_E_+2, etc. It should be noted that the disclosed SNPs are referred to herein using both short (e.g., 757_A_+4) and long (e.g., Gene 757 A +4) nomenclature.
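The naming convention described above can be illustrated with a short Python sketch. It is offered only as an illustration and is not part of the disclosed methods; the function names (parse_short_name, to_long_name) and the regular expression are hypothetical, and the sketch assumes that short names always consist of a gene identifier, an exon label, and a site number joined by underscores.

```python
import re

# Hypothetical pattern for short SNP names such as "757_A_+4" or "436_K_-2".
# Both the ASCII hyphen and the typographic minus sign are accepted.
SHORT_NAME = re.compile(
    r"^(?P<gene>[A-Za-z0-9]+)_(?P<exon>[A-Z]+)_(?P<site>[+\-\u2212]?\d+)$"
)


def parse_short_name(name):
    """Split a short SNP name into (gene, exon, site, region)."""
    match = SHORT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a short SNP name: {name!r}")
    site = match.group("site").replace("\u2212", "-")
    if site.startswith("-"):
        region = "intronic, 5' of the exon"
    elif site.startswith("+"):
        region = "intronic, 3' of the exon"
    else:
        region = "within the exon or UTR"
    return match.group("gene"), match.group("exon"), site, region


def to_long_name(name):
    """Convert short nomenclature (757_A_+4) to long (Gene 757 A +4)."""
    gene, exon, site, _ = parse_short_name(name)
    return f"Gene {gene} {exon} {site}"


if __name__ == "__main__":
    print(to_long_name("757_A_+4"))
    print(parse_short_name("436_K_-2"))
```

A signed site number marks an intronic polymorphism on the indicated side of the exon; an unsigned number marks a polymorphism within the exon or UTR sequence itself.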

The genomic sequences corresponding to the genes in Table 10 are shown in Tables 3A and 3B. Taking the information from Tables 3A and 3B, in combination with the last column in Table 4, one of skill in the art could identify the entire genomic sequence of the genes and SNPs described below. For example, the genomic sequence for Gene 214 is contained within BAC clones RP11-702C13 and AC079031 (see Table 4), and the nucleotide sequence of BAC clone RP11-702C13 corresponds to SEQ ID NO:766 to SEQ ID NO:808.

TABLE 10  SNPs PMP  SEQ Gene Exon Site Location Sequence ID NO PMPAA change 214 B 1 3′ UTR CCTGTGCACTCTTGGGCATA C GCCTAGGAGTGGAACTGCTG5921 C > T 214 C −1 Intron GGGCTCTGCGCCACCTCAAC C CAGGCGTTTGTTCCGCAGGA5922 C > T 214 E +1 Intron AAGGACACATTCTTATCAGC T GTAGTCACCACGTTCATTAC5923 T > C 214 E +2 Intron CCCTGTGACCCTCAACTCCC G GTCCCCTCCAGCCCTGACAG5924 G > C 214 E +3 Intron CTCAACTCCCGGTCCCCTC**CAGCCCTGACAGCCACTGTT5925  **> TC 214 E −1 Intron AGGCCGCTTCAACCCTTCCT C CGGCAGGGGGCAATGGCCAA5926 C > T 214 E 1 Exon CACCTGCATTCCCTCTCTCT G TGAGTGTCCTGGGGCCCGTT 5927G > T Val > Leu 214 E 2 3′ UTR GGGCTCTGCGCCACCTCAAC CCAGGCGTTTGTTCCGCAGGA 5928 C > T 214 E 3 3′ UTR TCAGGAGCCTGTGCTTGACC CCCAAATCCGCCCCCCAACTC 5929 C > T Pro > Ser 422 E 1 ExonCAGACACATGACAACTGCTA T GACCAGGCCAAGAAGCTGGA 5930 T > C 422 E 2 ExonACCCACACCTATTCATACTC G TGCTCTGGCTCGGCAATCAC 5931 G > A 436 A +1 IntronGGCCGCGCGGGGGGCGCGGC G GGTGCTGCCCTCGCGTCCGC 5932 G > T 436 A +2 IntronCTGCTTGCTGCTGTTTTAAA G CCACAGCCTGGGCCAGGCGC 5933 G > A 436 A −1 IntronCCTTCCGGGCCATCATCCGC G ATGACGGCGCCGCCAGCAGG 5934 G > T 436 A −2 IntronGCCCTCCCCCGGGCCCCGGG*******CCCCGACCGCCCGT 5935  *******> CCCCGGG 436 A−3 Intron TCCTCAAGGGMGAGGCCACT CC CCCCCCCCGCGAGTTCCAT 5936 CC >**  436 A1 Exon CGGGCGGCGCGGCCATGGCGG G CTGCTGCGCCGCGCTGGCG 5937 G > T Gly > Cys436 A 2 Exon TGAACCGCGCCGTGCAACTG C TCATCCTGGCCTACGTCATC 5938 C > TLeu > Phe 436 C +1 Intron ATTCCAGATGCGACCACTGT G TGTAAATCAGATGCCAGCTG5939 G > A 436 C +2 Intron AACGGTACGAGCTTGTGGCC T CCTGGGGAGGGCAGCCCCTG5940 T > C 436 C +3 Intron TGTGGCCTCCTGGGGAGGGC A GCCCCTGAGCAGATCGCCCC5941 A > G 436 C −1 Intron CTCTCCGTGAGTCCTCTGAG C GTGGCTTGCCCGTGCTGTCT5942 C > T 436 D −1 Intron GTCCACCTGTGTGTGGGGCC G GGCCACGTGGACTTTCTTTT5943 G > A 436 D 1 Exon ATTCCAGATGCGACCACTGT G TGTAAATCAGATGCCAGCTG 5944G > A 436 E 1 Exon TGCGTAGCTTTCAACGGGTC C GTCAAGACGTGTGAGGTGGC 5945 C >T 436 G 1 Exon TAGTGGAGAACGCAGGACAC A GTTTCCAGGACATGGCCGTG 5946 A > GSer > Gly 436 K +1 Intron GGGAGGCCCTTCTGCAGAGG C TGGCACCAGTGTGGCGTGGT5947 C > G 
436 K +2 Intron RCGACATCTCARGTTGGTGA T GATAATGCATGCTCTGAGAA5948 T > A 436 K −1 Intron GCTCACTCTCACCCTATGCT A AACTCAGGCGACCGTGCTGT5949 A > G 436 K −2 Intron GATTCCAGGCTTCTCAGGAA G GGGCACGCAAAGAATAAGAT5950 G > C 436 L −1 Intron CCAGGAGCGCACCTCCCTCC C GCCTGCCACAAGGGGTCCCA5951 C > T 436 L −2 Intron GGTGAAGTCCCAGGAGCGCA C CTCCCTCCCGCCTGCCACAA5952 C > A 436 L −3 Intron TGGCGTGGTGTCCCCGTTAA C CCGGGCAGTCCTGCCACTCT5953 C > T 436 L 1 3′ UTR AGCAGCTCCTGTGTGTTGTG T GCAGGATCTGTTTGCCCACT5954 T > C 454 B −1 Intron AAGTGCCTGCATCCTCCAA C GCCTGCATCCCAACCCGCTGT5955 C > T 454 B 1 Exon AGAGGTGAAAGAGGAGATCG T GGAGAATGGAGTGAAGAAGT 5956T > C Val > Ala 454 E −1 Intron CTCCTGGAGAACGTCCTCTC CGCAGTTCTTTCACATCTGTG 5957 C > T 454 E −2 Intron CAAAGCCTAGTCTCTCGCCC GGGTTGAGTTAATGATGTCCC 5958 G > A 454 E 1 Exon CCCCTATAGGAATTCAGACC GGAAGGTGTGTAGTGTATGAA 5959 G > A Gly > Arg 454 E 2 ExonAGACCGGAAGGTGTGTAGTG C ATGAAGGGAACCAGAAGACC 5960 C > T His > Tyr 454 E 3Exon TGTGAAGTCTCTGCCTGGTG C CCCATCGAGGCAGTGGAAGA 5961 C > T 454 F +1Intron TCTACTGCTGAGTAATAAAT T ATCCCAAACCTCAGAAGCCT 5962 T > C 454 F −1Intron CATGGGCTCCCTCGGTCCCC A CCGTCACTAATGGCCATTTT 5963 A > C 454 F −2Intron GGCCCATGGGCTCCCTCGGT C CCCACCGTCACTAATGGCCA 5964 C > T 454 F −3Intron TGTATCCATTTCTCTTCATG C ATCCCAAAGACCAAGCCAAG 5965 C > T 454 G −1Intron CACTCCTGGGAAAGAGACAG A TCTGTTTTCAATCGAGATGT 5966 A > T 454 H −1Intron GTTCTTCAATCAGCATTTTT C CTCTAAAAACCTTAAGCAAT 5967 C > T 454 H −2Intron TTTAGGACAATGAGTTTAAC G GTGATGTGTCCCAGACGGGG 5968 G > A 454 H 1Exon CCGTTGGTTCCATCACTGCC G TCCCAAATACAGTTTCCGTC 5969 G > A Arg > His454 H 2 Exon CCGTCCCAAATACAGTTTCC G TCGCCTTGACGACAAGACCA 5970 G > AArg > His 454 K 1 Exon GGAAAACAATGTTGAGAAAC G GACTCTGATAAAAGTCTTCG 5971G > A Arg > Gln 454 L −1 Intron ATTTCACCTGAGTAAACTCT CCCACTCTGTTTTTAGGGAGG 5972 C > T 454 M 2 Exon CATCGACTTCCTCATCGACA CTTACTCCAGTAACTGCTGTC 5973 C > G Thr > Ser 454 M +1 IntronGTTCACAGGACACCAAGACA C GGAGAGATTCCATGAAATCA 5974 C > T 454 M +2 IntronGTAGTGGATACGTCGCTGGG C TCTACCCCGATCAACCAACT 5975 C 
> T 454 M 1 ExonGTCTGCATTCTCCCCAGGCC A CTGTGTTCATCGACTTCCTC 5976 A > G Thr > Ala 454 O+1 Intron CTCACGTCTGTAATCCCAGC G CTTTGGGAGGCCGAGGCAGG 5977 G > A 454 O−1 Intron ATAAATCATGTAATATTAAA T GTAACTTTATAAGTTAATAA 5978 T > C 454 O 1Exon TGGACAACCAGAGGAGATAC A GCTGCTTAGAAAGGAGGCGA 5979 A > G Gln > Arg454 O 2 Exon CCTAGATCCAGGGATAGCCC C GTCTGGTGCCAGTGTGGAAG 5980 C > T 454O 3 Exon GAGCCACAGGTGCCTGGAGG A GCTGTGCTGCCGGAAAAAGC 5981 A > C Glu >Ala 454 O 4 Exon CTCTACCAGGAGCCCTTGCT G GCGCTGGATGTGGATTCCAC 5982 G > T454 O 5 Exon GGACATGGCTGACTTTGCCA T CCTGCCCAGCTGCTGCCGCT 5983 T > AIle > Asn 454 O 6 Exon AGGATCCGGAAAGAGTTTCC A AAGAGTGAAGGGCAGTACAG 5984A > G 515 A 1 Exon CAGCGTGGTTGTGCGGATCC G CATCTTCTGGCTCCTGCACA 5985 G >A Arg > His 515 A 2 Exon CGCTGCCTCCAGAGGAAGAT G ACAGGTGAGCCAGATAATAA5986 G > A Met > Ile 515 A 3 Exon GGCGCTCCAGAGGCGTTAAT GGCCAACTCCGGTGAGCCATG 5987 G > C MET > Ile 515 A 4 ExonGTCACTGGACTCGGCCTAAG G TTTCCTGGAACTTCCAGATT 5988 G > A Val > Ile 515 A 5Exon ACTTCCAGATTCAGAGAAT C TGATTTAGGGAAACTGTGGCA 5989 C > G Ile > MET515 A 6 3′ UTR CTTCCAGATTCAGAGAATC T GATTTAGGGAAACTGTGGCAG 5990 T > C515 A 7 Exon CTGGTTGCAAGGTGTGACC A CAGGAATCCTGGAGGAACAGA 5991 A > G 561B +1 Intron TGTGGTGGGGAGAGAATGGC C GTTGGCTGCCTGCGAGGGTG 5992 C > T 561 B+2 Intron CGAGGGTGTGCACAGGTGAA A TCGGTTTGGTGACACCTGGC 5993 A > G 561 B 1Exon AAGTTCCGGCAGCACGCTGG C AAGATTGACCTGCTGGGTGG 5994 C > G 561 C 1 ExonGAATATATCCGGCCCCTTCC G CAGCCTGGTGACAGGCCGGA 5995 G > A 561 E +1 IntronCAGGGCTCCCAACATACTCC T GGCCACCCAGCCCTCCTCTC 5996 T > C 561 E +2 IntronACTCCGTAGTTACCAGGTTT G CCCTCTTTGACGACTGGAAA 5997 G > C 561 E 1 ExonAGCTGAGCTGCCCCTCACGG C GGGAAAATACCTCTACGTCT 5998 C > T Ala > Val 561 G+2 Intron GGGTGGGGAGGGTTTGTTAG G CCCTAACGCAGCAGGGACCG 5999 G > A 561 G+3 Intron GTGGGGAGGGTTTGTTAGGC C CTAACGCAGCAGGGACCGGC 6000 C > T 561 G−1 Intron GCCAGGGCTGGTCCCTGAAC G CCTCCGTTCCCTTCTGTCCC 6001   G > A/C 561H −1 Intron GCTCACCTCGGGCAGCCCGC G AGCCAGCTCTGCTTGTCCAC 6002 G > A 561 H−2 Intron GGCTCCCCATTGCAGGACCGC G 
GGGGCTCACCTCGGGCAGC 6003 G >*  561 H 1Exon TCACCCAGCCGCATCCTGCC A CAGCCACAGGGCACCCCGGT 6004 A > G 561 J 1 ExonCTGGAAGATGGGGGAAGGAG G CGGCCCAGCGGCACGTCCCA 6005 G > A 561 M +1 IntronAAAATAGGTAAGCGCAAACC C CTATTCGACCTTCCCTGTGC 6006 C > A 561 M +2 IntronTATGCCAAAGTCATGTAAAT G TTGACCAGTGATTTTTCTTG 6007 G > A 561 M +3 IntronGCCAAAGTCATGTAAATGTT G ACCAGTGATTTTTCTTGGGC 6008 G > A 561 M +4 IntronTTGGGCAAAAGCCACCCTAC G AACCAGGACTGCCAGTAGTC 6009 G > A 561 P +1 IntronTAAGCAAACCTATTTAGCCT T TTTAATCTCTGTCCCGTTCT 6010 T > C 561 P 1 ExonGTGTTTTAGGGGGAGCTGAA T GGGCAGAAAGGCCTTGTGCC 6011 T > C 561 X −1 IntronTCTGTGAGGGTAAGGAACAC A TCTGCTCTGTTTACTACTTA 6012 A > T 561 X −2 IntronTCTCTCTGTGAGGGTAAGGA A CACATCTGCTCTGTTTACTA 6013 A > C 561 X −3 IntronGACACCCAGATTTTCAGGCA T CAAGTTCTTTCTTGCCTCAG 6014 T > A 561 Y +1 IntronATCTGGGGCCCTGGAGGGAG C GGGCTGGGCCAGGGAGGAAC 6015 C > G 561 Y +2 IntronTGAGGCACCCAGTGATGTCT C ATCCACTATCTGCTGGTTAT 6016 C > T 561 Y +3 IntronCCAGTGATGTCTCATCCACT A TCTGCTGGTTATCTCTGCTT 6017 A > G 561 Y −1 IntronTACCAAGTCTCTAAACATGG G GGCACCATCTCACATGTCCT 6018 G > C 561 Y −2 IntronTCCAATTGGCGAGAAGTTCC G TTGCTTTTTTAGGACACAGA 6019 G > A 561 Y −3 IntronCTCCAATTGGCGAGAAGTTC C GTTGCTTTTTTAGGACACAG 6020 C > T 570 C −1 IntronTTAACCACTTGACCGTATAT G GTTTTCATCCTTGAAGACTG 6021 G > C 570 C 1 ExonTTAGGTTAAAGATCGAGGTC C GGAAGCCACTAGGAGATTTT 6022 C > T Pro > Leu 570 C 2Exon AGGCGGTCTTGCTTTTGTGG T CTTCCTCTGTGGCAAGAGCG 6023 T > C Val > Ala570 C 3 Exon CTTTTGTGGTCTTCCTCTGT G GCAAGAGCGTTTTCATCACC 6024 G > AGly > Ser 570 C 4 Exon GAGGGCAGTGCTTTCACAGA C ATGTTCAAGATACTGACGTA 6025C > T 570 F +1 Intron GTTGTGGATTCAGAATATAG T GCTCACACGCAGTCGTGCCC 6026T > C 570 F −1 Intron AAGAAATCTTTTCCCAGTTC C GTTGTCTCTAAACTGAAGAG 6027C > T 570 F 1 Exon ATGTTCTTTGTCATGTGCTC G GCCTTTGCTGCAGGTAAGAG 6028 G >A 570 J 1 Exon TATTTGAACTATTACTTTTT T CTTCTGGCTGCTATTCAAGG 6029 T > C581 F +1 Intron TGTGGCCACTTTGCTGTTCA G ATTGTTCGGTTTGGCTTGTT 6030 G > C581 F +2 Intron CTTTGCTGTTCAGATTGTTC G GTTTGGCTTGTTTATTCCTG 6031 G > T581 F 
−1 Intron TGTACTATTGGCCTCAGGCA A TCCCACCTCAGCCCCCGAAA 6032 A > G698 B −1 Intron AGCCTTGCTATTGGCATCAG C TCTTTATTTTTTTAAAAAAT 6033 C > T698 B 1 Exon CGGGGCCCTGGGGGGACACT G CCAGGGCCTGCCATGCTCAT 6034 G > A 698E 1 Exon AGCCATGGGCATGCAAATGA G AAAAGCAATAATGTAAGTTA 6035 G > A Arg >Lys 698 I +1 Intron GTCTGCCTGCAAGGTTAGTC A CCTGTGGGGTTGCCATTCTA 6036 A >G 698 I +2 Intron GTTATTGATGGGCCCAGACT T TGGGAAGAACAGACGAGTTG 6037 T > C698 I −1 Intron TGATGCTGATACGGGATCTC T TGTATCCTGCTCCTTCTGTG 6038 T > C702 A −1 Intron TTTATTAAGACACTTTTCCG G CAGCTGCCCAGGGAAGAGAC 6039 G > A702 B +1 Intron ACCTGTCGTGGAGGTGGGTG T GTGGCCAGGGTGAGGAGCGG 6040 T > C702 B +2 Intron GGAGGTGGGCGCGTGGCCAG G GTGAGGAACGGGGTCTCCGT 6041 G > C702 B +3 Intron GGGTGCGTGGCCAGGGTGAG G AACAGGGTCTCCGTGGAGGT 6042 G > C702 B −2 Intron GTGCCAGAGTCAGGGCTCCC A CCCTTGCGGATGCTCGGGAT 6043 A > G702 C 1 Exon GCCCGACAGGCCAGCACCCA G CGAGGTCAGCCGGGCCGAGC 6044 G > AAla > Thr 702 D −1 Intron GGGATGCCTCGATGCCGGCT G CGCCAGAGGGATTCTGCAGG6045 G > A 702 D 1 Exon CCTCGTAGGGGAGCCCGTAG C GCAGCGGGTCACCCACCGGG 6046C > T Arg > His 702 F −1 Intron GCCCTGTCCCGCGCTGCCCA GGGCCCCGCCTCCCAGCCCAC 6047 G >*  702 F 1 Exon GACGCGGTGGCCCAGATCCG GGGTGAAGCTTTCTTCTTCAA 6048 G > C Arg > Pro 702 I 1 ExonTGTGTGGAGACTCACAGGCC G ATGGATCTGTGGCTGCGGGC 6049 G > A Asp > Asn 702 I 3Exon CCCAGAGGTGCATGAGCAGA C CTCGTAACCGTCCTCCGAGC 6050 G > A Val > Ile722 AA +2 Intron CACGCAGTACAGATAATGCC A TCTAGTGATACATCTGCCTG 6051 A > G722 AA −1 Intron GGATGTCTTTTAATGTGGCA A TATGAAATTAACCATGCATG 6052 A > G722 AA −2 Intron GCCACCACACCTGGCCAGGT C GTTTTATTTTAAATGAAGGA 6053 C > T722 AA −3 Intron CTCAGGTGATCCATCCGCCT C GGACTCCCAAAGTGCTGAGA 6054 C > G722 AA −4 Intron CTGACCTCAGGTGATCCATC C GCCTCGGACTCCCAAAGTGC 6055 C > T722 C 1 Exon GGTGGAGGAGATTAGAAACA G TATTGATAAAATAACTCAAT 6056 G > CSer > Thr 722 F +1 Intron AAGTGAGTAATGGAGACTCC G TCTTTGTTAAAATCATGTTT6057 G > A 722 G −1 Intron AAAAATGCTAACAACTATGA T TGTAGTTGCTAACTTATGGT6058 T > C 757 A +1 Intron ACTTTTGTTTAGAGCCCTCC G TAAATATACATCTGTGTATT6059 G 
> C 757 A +2 Intron GAGTTGCTTAAAATAGACTC C GGCCTTCACCAATAGTCTCT6060 C > T 757 A +3 Intron AGGCCCAGCCCTCAGAAACC C TTCAGTGCTACATTTTGTGG6061 C > T 757 A +4 Intron ACCAAGCCAATGTTATAGAC G TTTGGACTGATTTGTGGAAA6062 G > C 757 A +5 Intron GACTGATTTGTGGAAAGGAG G GGGGAAGAGGGAGAAGGATC6063 G > A 757 A +6 Intron GTCTAGTGTATTCTCTTCAC A GTGCCAGGAAAGAGTGGTTT6064 A > G 757 A −1 Intron CCGAGCCGGGGGCGCTGTGC G CAGCGCTCGGGCCAGGCCGG6065 G > A 757 A 2 Exon TTGCACGAGTTCGCGCCGCT G GTGGAGTACGGCTGCCACGG 6066G > C 757 A 4 Exon CTCACCTTCCTCATCGACCC G GCCCGCTTCCGCTACCCCGA 6067 G >C 757 A 5 Exon AGCCGGAGAAAACCGGCCAG C GTGATCACCAGCGGTGGGAT 6068 C > T

Example 10 Allele Specific Assay

Once variants were confirmed by sequencing, rapid allele specific assays were designed to type more than 400 individuals (>200 cases and >200 controls) for use in the association studies. All coding SNPs (cSNPs) that resulted in an amino acid change were typed. Neutral polymorphisms were typed if: 1) the polymorphism was present in an exon lacking a cSNP; 2) the polymorphism was present in an exon containing a cSNP, but the two polymorphisms were observed to have different frequencies; or 3) the polymorphism was in an intronic region adjacent to an exon without a cSNP. If results from the association studies appeared positive, additional neutral polymorphisms were typed.

Three types of allele specific assays (ASAs) were used. If the SNP resulted in a mutation that created or abolished a restriction site, RFLPs were obtained from PCR products that spanned the variants, and were subsequently analyzed. If the polymorphism did not result in an RFLP, allele-specific oligonucleotide or exonuclease proofreading assays were used. For the allele-specific oligonucleotide assays, PCR products that spanned the polymorphism were electrophoresed on agarose gels and transferred to nylon membranes by Southern blotting. Oligomers 16-20 bp in length were designed such that the middle base was specific for each variant. The oligomers were labeled and successively hybridized to the membrane in order to determine genotypes.
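The criterion for choosing an RFLP assay, namely that the SNP creates or abolishes a restriction site, can be checked computationally. The following Python sketch is illustrative only and is not the procedure used in this example; the function names are hypothetical, the recognition sequence must be supplied by the user (CCGG is the commonly cited recognition sequence of MspI), and ambiguity codes such as R or M in the flanking sequence are not handled. The flanking sequence and alleles are those listed for 214_E_+2 in Table 10.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_sites(seq, site):
    """Count occurrences of a recognition site on either strand."""
    seq, site = seq.upper(), site.upper()
    patterns = {site, reverse_complement(site)}
    width = len(site)
    return sum(seq[i:i + width] in patterns for i in range(len(seq) - width + 1))


def rflp_effect(left, ref, alt, right, site):
    """Report whether the variant allele creates or abolishes a site."""
    ref_count = count_sites(left + ref + right, site)
    alt_count = count_sites(left + alt + right, site)
    if alt_count < ref_count:
        return "abolished by variant"
    if alt_count > ref_count:
        return "created by variant"
    return "no change"


if __name__ == "__main__":
    # 214_E_+2 (G > C), flanking sequence as listed in Table 10.
    left = "CCCTGTGACCCTCAACTCCC"
    right = "GTCCCCTCCAGCCCTGACAG"
    print(rflp_effect(left, "G", "C", right, "CCGG"))
```

Because the same flanking sequence is used for both alleles, any difference in site count must arise from the polymorphic base.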

Table 11A, below, shows the information for the ASAs. Column 1 lists the SNP names. Column 2 lists the specific assays used (RFLP or ASO). Column 3 lists the enzymes used in the RFLP assay (described below). Columns 4 and 6 list the sequences of the primers used in the ASO assay (described below). Columns 5 and 7 list the corresponding SEQ ID NOs for the primers. It should be noted that the disclosed SNPs are referred to herein using both short (e.g., 454_E_2; see Table 11A) and long (e.g., Gene 454 E 2; see Examples 11-13) nomenclature.

TABLE 11A ASA PRIMERS SEQ SEQ ASA RFLP ID ID Base AA SNP Type EnzymeASO Primer1 NO: ASO Primer2 NO: change change 214_B_1 RFLP NdeII C > T214_C_−1 ASO ACCTCAACCCAGGCGTT 6069 CACCTCAACTCAGGCGTTTG 6080 C > T214_E_+1 RFLP PvuII T > C 214_E_+2 RFLP MspI G > C 214_E_−1 RFLP AvaIC > T 214_E_1 ASO CTCTCTCTGTGAGTGTCC 6070 CTCTCTCTTTGAGTGTCCTGG 6081 G >T Val > Leu 214_E_2 ASO ACCTCAACCCAGGCGTTT 6071 CCACCTCAACTCAGGCGTTT6082 C > T 214_E_3 ASO TGCTTGACCCCCAAATCC 6072 GTGCTTGACCTCCAAATCCG 6083C > T Pro > Ser 422_E_2 ASO TCATACTCGTGCTCTGGC 6073TATTCATACTCATGCTCTGGCT 6084 G > A 436_A_+2 ASO TGTTTTAAAGCCACAGCCT 6074CTGTTTTAAAACCACAGCCTGG 6085 G > A 436_A_1 ASO CCCTCGGTTCCCACCGTC 6075GCCATGGCGTGCTGCTGC 6086 G > T Gly > Cys 436_A_2 ASO GTGCAACTGCTCATCCTG6076 CGTGCAACTGTTCATCCTGG 6087 C > T Leu > Phe 436_C_+1 RFLP DraIII G >A 436_C_−1 RFLP MwoI C > T 436_D_1 RFLP DraIII G > A 436_E_1 RFLP AvaIIC > T 436_G_1 ASO GCAGGACACAGTTTCCAGGA 6077 CAGGACACGGTTTCCAG 6088 A > GSer > Gly 436_K_+1 RFLP AlwNI C > G 436_K_−2 ASO CTCAGGAAGGGGCACGCA 6078CTCAGGAACGGGCACGCA 6089 G > C 436_L_−1 ASO CTCCCTCCCGCCTGCCAC 6079CTCCCTCCTGCCTGCCAC 6090 C > T 436_L_−3 RFLP XmaI C > T 436_L_1 RFLP HhaIT > C 454_B_1 RFLP BstuI T > C Val > Ala 454_E_−1 RFLP PstI C > T454_E_1 RFLP HpaII G > A Gly > Arg 454_E_2 RFLP NlaIII C > T His > Tyr454_E_3 RFLP BanI C > T 454_F_−2 ASO CCCTCGGTCCCCACCGTC 6091CCCTCGGTTCCCACCGTC 6107 C > T 454_G_−1 RFLP BstYI A > T 454_H_1 ASOCATCACTGCCGTCCCAAA 6092 CCATCACTGCCATCCCAAAT 6108 G > A Arg > His454_H_2 ASO CAGTTTCCGTCGCCTTGA 6093 CAGTTTCCATCGCCTTGACG 6109 G > AArg > His 454_K_1 RFLP AlwNI G > A Arg > Gln 454_L_−1 RFLP EarI C > T454_M_+1 ASO CCAAGACACGGAGAGATT 6094 ACCAAGACATGGAGAGATTCC 6110 C > T454_M_1 RFLP MspAI A > G Ala > Thr 454_M_2 ASO CATCGACACTTACTCCAG 6095CATCGACAGTTACTCCAG 6111 C > G Thr > Ser 454_O_1 RFLP PvuII A > G Gln >Arg 454_O_3 RFLP HhaI A > C Glu > Ala 454_O_5 ASO ACTTTGCCATCCTGCCCAG6096 ACTTTGCCAACCTGCCCAG 6112 T > A Ile > Asn 454_O_6 RFLP MboII A > G515_A_1 ASO 
GCGGATCCGCATCTTCT 6097 TGCGGATCCACATCTTCTGG 6113 G > A Arg >His 515_A_2 ASO GGAAGATGACAGGTGAGC 6098 AGGAAGATAACAGGTGAGCC 6114 G > AMET > Ile 515_A_3 RFLP HaeIII G > C MET > Ile 515_A_4 RFLP Bsu36I G > AVal > Ile 515_A_5 RFLP BsmI C > G Ile > MET 515_A_6 RFLP BsmI T > C515_A_7 RFLP XcmI A > G 561_B_+1 ASO AGAATGGCCGTTGGCTG 6099GAGAATGGCTGTTGGCTGC 6115 C > T 561_B_1 ASO CACGCTGGCAAGATTGAC 6100CACGCTGGGAAGATTGAC 6116 C > G 561_C_1 RFLP MwoI G > A 561_E_+1 RFLP MspIT > C 561_E_1 ASO CCTCACGGCGGGAAAAT 6101 CCCTCACGGTGGGAAAATAC 6117 C > TAla > Val 561_H_1 ASO CATCCTGCCACAGCCACAG 6102 ATCCTGCCGCAGCCACA 6118A > G 561_J_1 ASO GGAAGGAGGCGGCCCA 6103 GGGAAGGAGACGGCCCAG 6119 G > A561_M_+1 ASO CGCAAACCCCTATTCGAC 6104 GCGCAAACCACTATTCGACC 6120 C > A561_P_1 ASO GAGCTGAACGGGCAGAA 6105 GGAGCTGAATGGGCAGAAAG 6121 T > C Arg >Trp 561_X_−3 ASO ATTTTCAGGCATCAAGTTCTTTC 6106 ATTTTCAGGCAACAAGTTCTTTCT6122 T > A 561_Y_+1 RFLP BsrBI C > G 561_Y_−1 RFLP Fnu4HI G > C 570_C_1RFLP MspI C > T Pro > Leu 570_C_2 ASO GCTTTTGTGGTCTTCCTCTG 6123CTTTTGTGGCCTTCCTCT 6135 T > C Val > Ala 570_C_3 ASO CTTTCACAGACATGTTCAAG6124 GCTTTCACAGATATGTTCAAGA 6136 G > A Gly > Ser 570_C_4 RFLP AflIII C >T 570_F_1 RFLP DdeI G > A 581_F_+2 ASO AGATTGTTCGGTTTGGCTT 6125TCAGATTGTTCTGTTTGGCTTG 6137 G > T 698_E_1 ASO CATGCAAATGAGAAAAGCAAT 6126GGCATGCAAATGAAAAAAGCAAT 6138 G > A Arg > Lys 698_I_+1 ASOCCCCACAGGTGACTAACCTT 6127 CCCACAGGCGACTAACC 6139 A > G 702_A_−1  ASOACTTTTCCGGCAGCTGC 6128 ACTTTTCCGTCAGCTGCCC 6140 G > A 702_B_+1 ASOAGGTGGGTGTGTGGCCAG 6129 GGTGGGTGCGTGGCCA 6141 T > C 702_B_+3 ASOAGGGTGAGGAACGGGGT 6130  AGGGTGAGCAACGGGGT 6142 G > C 702_C_1 RFLP HaeIIG > A Ala > Thr 702_D_1 RFLP HhaI C > T Arg > His 702_F_1 RFLP NciII G >C Arg > Pro 702_I_1 RFLP XcmI G > A Asp > Asn 702_I_3 RFLP DpnII G > AVal > Ile 722_C_1 ASO GATTAGAAACAGTATTGATAAA 6131 GATTAGAAACACTATTGATAAA6143 G > C Ser > Thr 722_F_+1 RFLP Tth111 G > A 722_G_−1 ASOAACAACTATGATTGTAGTTGCTA 6132 CAACTATGACTGTAGTTGC 6144 T > C 757_A_+4RFLP HpyCH4IV G > C 757_A_−1 ASO 
GCTGTGCGCAGCGCTC 6133CGCTGTGCACAGCGCTCG 6145 G > A 757_A_2 ASO GCGCCGCTGGTGGAGTA 6134GCGCCGCTCGTGGAGTA 6146 G > C 757_A_4 RFLP Sau96I G > C 757_A_5 RFLPCac8I C > T

1. RFLP Assay:

The amplicon containing the polymorphism was PCR amplified using primers that generated fragments for sequencing (sequencing primers) or SSCP (SSCP primers). The appropriate population of individuals was PCR amplified in 96-well microtiter plates. Enzymes were purchased from NEB. The restriction cocktail containing the appropriate enzyme for the particular polymorphism was added to the PCR product. The reaction was incubated at the appropriate temperature according to the manufacturer's recommendations for 2-3 hr, followed by a 4° C. incubation. After digestion, the reactions were size fractionated using the appropriate agarose gel depending on the assay specifications (2.5%, 3%, or Metaphor, FMC Bioproducts). Gels were electrophoresed in 1×TBE buffer at 170 V for approximately 2 hr. The gel was illuminated using UV, and the image was saved as a Kodak 1D file. Using the Kodak 1D image analysis software, the images were scored and the data was exported to Microsoft® Excel (Microsoft Corp.; Redmond, Wash.).

2. ASO Assay:

The amplicon containing the polymorphism was PCR amplified using primers that generated fragments for sequencing (sequencing primers) or SSCP (SSCP primers). The appropriate population of individuals was PCR amplified in 96-well microtiter plates and re-arrayed into 384-well microtiter plates using a Tecan Genesis RSP200. The amplified products were loaded onto 2% agarose gels and size fractionated at 150 V for 5 min. The DNA was transferred from the gel to Hybond N+ nylon membrane (Amersham-Pharmacia) using a Vacuum blotter (Bio-Rad). The filter containing the blotted PCR products was transferred to a dish containing 300 ml pre-hybridization solution (5×SSPE (pH 7.4), 2% SDS, 5×Denhardt's). The filter was incubated in pre-hybridization solution at 40° C. for over 1 hr. After pre-hybridization, 10 ml of the pre-hybridization solution and the filter were transferred to a washed glass bottle. The allele-specific oligonucleotides (ASO) were designed to contain the polymorphism in the middle of the nucleotide sequence. The size of the oligonucleotide was dependent upon the GC content of the sequence around the polymorphism. Those ASOs that had a G or C polymorphism were designed so that the T_(m) was between 54-56° C. Those ASOs that had an A or T polymorphism were designed so that the T_(m) was between 60-64° C. All oligonucleotides were phosphate-free at the 5′ ends and purchased from GibcoBRL. For each polymorphism, 2 ASOs were designed to yield one ASO for each strand.
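The ASO design constraints above (polymorphic base near the middle, length governed by GC content, T_(m) within a target window) can be sketched in Python as follows. This is an illustration under stated assumptions and not the design procedure actually used: the text does not say how T_(m) was calculated, so the sketch assumes the simple rule T_(m) = 2(A+T) + 4(G+C); the function names, the alternating extension strategy, and the max_len limit are all hypothetical.

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C) (assumed rule)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def design_aso(left, allele, right, tm_min, tm_max, max_len=25):
    """Grow an oligo outward from the variant base until Tm is in range.

    Bases are taken alternately from the 5' and 3' flanks so that the
    polymorphic base stays near the middle of the oligo.
    """
    left, right = left.upper(), right.upper()
    oligo = allele.upper()
    li, ri = len(left), 0  # next base to take from each flank
    take_left = True
    while True:
        tm = wallace_tm(oligo)
        if tm_min <= tm <= tm_max:
            return oligo
        if tm > tm_max or len(oligo) >= max_len:
            raise ValueError("no oligo found in the requested Tm window")
        if take_left and li > 0:
            li -= 1
            oligo = left[li] + oligo
        elif ri < len(right):
            oligo += right[ri]
            ri += 1
        elif li > 0:
            li -= 1
            oligo = left[li] + oligo
        else:
            raise ValueError("flanking sequence exhausted")
        take_left = not take_left


if __name__ == "__main__":
    # 214_E_1 (G > T), flanking sequence as listed in Table 10.
    left = "CACCTGCATTCCCTCTCTCT"
    right = "TGAGTGTCCTGGGGCCCGTT"
    g_oligo = design_aso(left, "G", right, 54, 56)
    t_oligo = design_aso(left, "T", right, 60, 64)
    print(g_oligo, wallace_tm(g_oligo))
    print(t_oligo, wallace_tm(t_oligo))
```

Under this assumed rule, the two 214_E_1 ASO primers listed in Table 11A evaluate to 56° C. (G allele) and 64° C. (T allele), which fall within the stated windows; oligos produced by the sketch need not coincide with the listed primers.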

The ASOs that represented each polymorphism were resuspended at a concentration of 1 μg/μl. Each ASO was end-labeled with γ-ATP³² (6000 Ci/mmol) (NEN) using T4 polynucleotide kinase according to manufacturer recommendations (NEB). The end-labeled products were removed from the unincorporated γ-ATP³² using a Sephadex G-25 column according to the manufacturer's instructions (Amersham-Pharmacia). The entire end-labeled product of one ASO was added to the bottle containing the appropriate filter and 10 ml hybridization solution. The hybridization reaction was placed in a rotisserie oven (Hybaid) and left at 40° C. for a minimum of 4 hr. The other ASO was stored at −20° C.

After the prerequisite hybridization time had elapsed, the filter was removed from the bottle and transferred to 1 L of wash solution (0.1×SSPE (pH 7.4) and 0.1% SDS) pre-warmed to 45° C. After 15 min, the filter was transferred to another liter of wash solution (0.1×SSPE (pH 7.4) and 0.1% SDS) pre-warmed to 50° C. After 15 min, the filter was wrapped in Saran Wrap®, placed in an autoradiograph cassette, and an X-ray film (Kodak) was placed on top of the filter. Typically, an image was visible within 1 hr. After an image was captured on film following the 50° C. wash, images were captured following wash steps at 55° C., 60° C. and 65° C. The best image was selected.

The ASO was removed from the filter by adding 1 L of boiling strip solution (0.1×SSPE (pH 7.4) and 0.1% SDS). This was repeated two more times. After removing the ASO, the filter was pre-hybridized in 300 ml pre-hybridization solution (5×SSPE (pH 7.4), 2% SDS, and 5×Denhardt's) at 40° C. for over 1 hr. The second end-labeled ASO corresponding to the other strand was removed from storage at −20° C. and thawed at RT. The filter was placed into a glass bottle along with 10 ml hybridization solution and the entire end-labeled product of the second ASO. The hybridization reaction was placed in a rotisserie oven (Hybaid Limited; United Kingdom) and left at 40° C. for a minimum of 4 hr. After the hybridization, the filter was washed at various temperatures and images captured on film as described above. The best image for each ASO was converted into a digital image by scanning the film into Adobe® Photoshop®. These images were overlaid using Graphic Converter, and the overlaid images were scored.

3. Exonuclease Proofreading Assay:

Exonuclease Proofreading Assays (EPAs) were also employed (see U.S. Pat. No. 5,391,480). Briefly, primers corresponding to the polymorphisms of interest were designed to contain fluorescent tags at the 3′ ends. The primers were designed such that the 3′ ends contained the variant or consensus nucleotides. Mismatched bases at the 3′ ends were removed by an exonuclease proofreading enzyme (Pwo DNA polymerase; Roche, Germany; Cat. No. 1-644-855) in the PCR reaction. Where bases were matched, the resulting PCR products contained the tagged bases. The tagged bases were detected by gel electrophoresis or fluorescence polarization. Examples of primers used for EPA analysis of Gene 436, Gene 454, Gene 570, and Gene 698 are shown in Table 11B, below.

TABLE 11B
EPA PRIMERS

SNP        Primer Seq. (5′-3′)        SEQ ID NO:
436_K_−2   TTATTCTTTGCGTGCCC          6147
436_K_−2   ACCTTCCCTTCTCCAAGACC       6148
436_K_−2   ATTCCAGGCTTCTCAGGAA        6149
436_K_−2   CGCCTGAGTTTAGCATAGGG       6150
454_F_−2   CATGGGCTCCCTCGGT           6151
454_F_−2   CCGGGGAAGTCGATATTGTT       6152
454_F_−2   CATGGGCTCCCTCGGT           6153
570_C_2    GCGGTCTTGCTTTTGTGG         6154
570_C_2    TTACTCTGGCGCTCTCCACT       6155
570_C_2    CGGTCTTGCTTTTGTGG          6156
698_I_+1   AGAATGGCAACCCCACAGG        6157
698_I_+1   GCTGGTTCTCACGCTGCATATTT    6158
698_I_+1   GTAGAATGGCAACCCCACAGG      6159

Example 11 Association Study Analysis

1. Case-Control Study:

In order to determine whether polymorphisms in candidate genes were associated with the asthma phenotype, association studies were performed using a case-control design. In a well-matched design, the case-control approach is more powerful than the family based transmission disequilibrium test (TDT) (N. E. Morton and A. Collins, 1998, Proc. Natl. Acad. Sci. USA 95:11389-93). Case-control studies are, however, sensitive to population admixture.

To avoid issues of population admixture, which can bias case-control studies, unaffected controls were collected in both the US and the UK. A total of 300 controls were collected, 200 in the UK and 100 in the US. Inclusion into the study required that the control individual 1) was negative for asthma (as determined by self-report of never having asthma); 2) had no first-degree relatives with asthma; and 3) was negative for eczema and symptoms indicative of atopy for the past 12 months. Data from an abbreviated questionnaire similar to that administered to the affected sib pair families were collected. Results from skin prick tests to 4 common allergens were also collected. The results of the skin prick tests were used to select a subset of controls that were most likely to be asthma and atopy negative.

A subset of unrelated cases was selected from the affected sib pair families based on the evidence for linkage at the chromosomal location near a given gene. One affected sib demonstrating identity-by-descent (IBD) at the appropriate marker loci was selected from each family. As the appropriate cases may vary for each gene in the region, a larger collection of individuals who were IBD across a larger interval was genotyped. A subset of this collection was used in the analyses. On average, 115 IBD affected individuals and 200 controls were compared for allele and genotype frequencies. This number provided 80% power to detect a difference of 5% or greater between the two groups for a rare allele (5%) at a 0.05 level of significance. For a common allele (50%), the number provided 80% power to detect a difference of 10% or more between the two groups.

For each polymorphism, the frequency of the alleles in the control and case populations was compared using a Fisher's exact test. A mutation that increased susceptibility to the disease was expected to be more prevalent in the cases than in the controls, while a protective mutation was expected to be more prevalent in the control group. Similarly, the genotype frequencies of the SNPs were compared between cases and controls. P-values for the allele test were plotted against a coordinate system based on genomic sequence to visualize regions where allelic association was present. A small p-value (or a large value of −log (p) as plotted in the figures described below) was deemed indicative of an association between the SNPs and the disease phenotype. The analysis was repeated for the US and UK populations, separately, to correct for genetic heterogeneity.
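The allele comparison described above can be sketched as a two-sided Fisher's exact test on a 2×2 table of allele counts (controls versus cases, allele present versus absent). The sketch below is illustrative only. It assumes that each genotyped individual contributes two alleles and that allele counts can be recovered by rounding the reported frequencies, neither of which is stated in the text. The function names are hypothetical, and the result need not match the tabulated p-values exactly.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


def allele_counts(freq_percent: float, n_individuals: int) -> tuple[int, int]:
    """Assumed conversion: 2 alleles per individual, counts rounded."""
    n_alleles = 2 * n_individuals
    carrying = round(freq_percent / 100 * n_alleles)
    return carrying, n_alleles - carrying


# Frequencies and sample sizes for SNP 454_E_2 in the combined population.
control = allele_counts(49.1, 217)
case = allele_counts(60.3, 117)
p_value = fisher_exact_two_sided(*control, *case)
print(control, case, p_value)
```

The genotype test would use a larger table (one column per genotype), for which an exact test requires enumerating more tables than this 2×2 sketch covers.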

2. Association Test with Individual SNPs:

Chromosomal regions harboring asthma susceptibility genes were identified by association studies using the SNP typing data. Four separate phenotypes were used in these analyses: asthma, bronchial hyper-responsiveness, total IgE, and specific IgE.

a. Asthma Phenotype:

A coordinate system was developed based on available genomic sequence, and was used to plot significance values of SNPs and haplotypes according to their relative location along the chromosome (as shown in FIGS. 11-26). Overlapping genomic sequences were assembled to provide a framework for estimation of relative physical distance between SNPs. Where necessary, gaps were introduced to provide contiguity.

The significance levels (p-values) for allelic association of all typed SNPs to the asthma phenotype are plotted in FIG. 11 (combined population) and FIG. 12 (US and UK populations, separately). Frequencies and p-values for SNPs associated with the asthma phenotype are shown in Tables 12A, 12B, and 12C for the combined population and for the UK and US populations, separately. Column 1 lists the SNP names, which were derived from the gene numbers and closest exons. Columns 2 and 3 list the control (“CNTL”) allele frequencies and sample sizes (“N”), respectively. Columns 4 and 5 list the affected individuals (“CASE”) allele frequencies and sample sizes (“N”), respectively. Columns 6 and 7 list the p-value for the comparison between the case and control allele and genotype frequencies, respectively.
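The seven-column layout described above maps naturally onto a small record type. The following sketch parses cleanly delimited rows of this form and lists the SNPs whose allele or genotype p-value falls below 0.05. The class, pattern, and function names are hypothetical, and the three sample rows are copied from Table 12A. The pattern assumes one whitespace-separated row per SNP and will not recover rows whose fields have run together.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpRow:
    snp: str            # column 1: gene number, exon, and SNP index
    cntl_freq: float    # column 2: control allele frequency (%)
    cntl_n: int         # column 3: number of controls typed
    case_freq: float    # column 4: case allele frequency (%)
    case_n: int         # column 5: number of cases typed
    allele_p: float     # column 6: allele test p-value
    genotype_p: float   # column 7: genotype test p-value


# SNP names may carry "+", "-" or the Unicode minus sign (U+2212).
ROW_PATTERN = re.compile(
    r"(\d+_[A-Z]_[+\u2212-]?\d+)\s+"
    r"(\d+\.\d)%\s+(\d+)\s+"
    r"(\d+\.\d)%\s+(\d+)\s+"
    r"([01]\.\d{4})\s+([01]\.\d{4})"
)


def parse_rows(text: str) -> list[SnpRow]:
    rows = []
    for m in ROW_PATTERN.finditer(text):
        snp, cf, cn, af, an, ap, gp = m.groups()
        rows.append(SnpRow(snp, float(cf), int(cn), float(af), int(an),
                           float(ap), float(gp)))
    return rows


def significant(rows: list[SnpRow], test: str = "allele",
                alpha: float = 0.05) -> list[str]:
    field = "allele_p" if test == "allele" else "genotype_p"
    return [r.snp for r in rows if getattr(r, field) < alpha]


SAMPLE = """\
454_E_−1 25.8% 213 27.0% 113 0.7792 0.9290
454_E_2 49.1% 217 60.3% 117 0.0058 0.0070
436_E_1 46.3% 214 40.6% 112 0.1844 0.0382
"""
rows = parse_rows(SAMPLE)
print(significant(rows, "allele"))
print(significant(rows, "genotype"))
```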

SNPs in Gene 454, Gene 757, Gene 561, and Gene 214 showed a significant association with the asthma phenotype in the combined population, when comparing the allele frequency in the case and control groups (Table 12A). When analyzing the populations separately, SNPs in Gene 454, Gene 757, Gene 698 and Gene 561 showed a significant association in the UK population alone (Table 12B), while SNPs in Gene 454 and Gene 561 showed a similar association with the phenotype in the US population (Table 12C). Additional significant results emerged when comparing the genotype frequency of the control and case groups. SNPs in Gene 436 in the combined population, and in Gene 515 and Gene 570 in the US population, also reached statistical significance.

Seven SNPs in Gene 454 showed allelic frequencies significantly different in the cases versus the controls in the combined population. Two SNPs in exon O were more frequent in the controls (19% and 42%, respectively) than in the cases (12% and 33%, respectively). These differences were statistically significant (p=0.03 and p=0.02), with similar p-values obtained for the genotype comparison (p=0.03 and p=0.01). The first SNP also reached statistical significance in the UK population (p=0.03), while the second SNP had a significant genotype p-value (p=0.02) in the UK population. The first SNP in exon O results in an amino acid change of glutamine to arginine. In addition, one SNP in exon M, and one just outside exon M, reached statistical significance in both the US and combined populations. These two SNPs showed high linkage disequilibrium and had similar allele frequencies. For the exonic SNP, the p-value was 0.01 for the combined population, and the allele frequencies were 42% in controls versus 32% in cases. The p-value was 0.02 for the US sample, and the allele frequencies were 41% in controls versus 23% in cases. The genotype comparison was significant (p=0.02) in the combined population. The intronic SNP showed significance for both the allele and genotype tests in the combined population (p=0.007), for the allele comparison in the US (p=0.02), and for the genotype comparison in the UK (p=0.03).

Three other SNPs reached statistical significance in the combined population, and in the US or UK populations, alone: 1) SNP E 2, which results in a histidine to tyrosine amino acid change (allele frequencies of 49% in controls and 60% in cases, p=0.006 and p=0.007 for the allele and genotype tests, respectively, in the combined population; p=0.02 for both allele and genotype tests in the UK population, allele frequencies of 51% in controls and 63% in cases); 2) SNP H 1, an arginine to histidine amino acid change (p=0.003 for the allele and p=0.002 for the genotype tests in the combined population, allele frequencies of 22% in the controls and 33% in the cases; p=0.04 in the UK population, allele frequencies of 23% in controls and 32% in cases; p=0.03 for the allele and p=0.02 for the genotype test in the US population, allele frequencies of 20% in controls and 36% in cases); and 3) intronic SNP F −2 (p=0.008 for the allele and p=0.003 for the genotype tests in the combined population, allele frequencies of 35% in controls and 24% in cases; p=0.04 for the allele and p=0.03 for the genotype tests in the US population, allele frequencies of 37% in controls and 21% in cases).

One SNP in Gene 757 reached statistical significance for the allele test in the combined and UK populations (SNP A 2; p=0.03 in the combined population, allele frequencies of 18% in controls and 26% in cases; p=0.03 in the UK sample, allele frequencies of 17% in controls and 26% in cases). Another SNP in the same exon (A 4) reached statistical significance for the genotype test in the combined population (p<0.05).

Multiple SNPs in Gene 561 reached statistical significance in either the combined population or in the US or UK populations, separately. SNP J 1 was significant in the combined population (p=0.04, allele frequencies of 15% in controls and 9% in cases); SNP Y +1 was significant in the UK sample (p=0.002, allele frequencies of 5% in controls and not present in cases); and SNP H 1 was significant in the US population (p=0.02, allele frequencies of 10% in controls and 25% in cases). SNP H 1 also showed a significant genotype p-value in the combined population (p=0.03) and in the US population (p=0.01), while SNP Y +1 showed a significant genotype p-value (p=0.001) in the UK population. None of these SNPs resulted in amino acid changes.

A single SNP in Gene 214 reached statistical significance in the combined population (p=0.04, allele frequencies of 28% in controls and 36% in cases).

For Gene 436, one SNP (E 1) showed a significant genotype p-value in the combined population (p=0.04).

One SNP in Gene 698 (E 1) reached statistical significance in the UK population (p=0.01 for the allele test, p=0.02 for the genotype test, allele frequencies of 5% in controls and 12% in cases). This SNP results in an arginine to lysine amino acid change.

SNPs in two genes, Gene 515 and Gene 570, showed significant genotype p-values in the US population alone (515 A 1, p=0.007; 515 A 2, p=0.005; 515 A 4, p=0.001; 570 F 1, p=0.007).

TABLE 12A
ASSOCIATION ANALYSIS OF ASTHMA PHENOTYPE
COMBINED US/UK POPULATION

Combined US and UK
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    7.1%  183   4.1%   98   0.1939   0.1793
454_E_−1  25.8%  213  27.0%  113   0.7792   0.9290
454_E_1    0.8%  179   2.7%  112   0.0939   0.2361
454_E_2   49.1%  217  60.3%  117   0.0058   0.0070
454_F_−2  34.6%  211  24.3%  115   0.0078   0.0030
454_G_−1   8.4%  215   9.7%  113   0.5651   0.4447
454_H_1   22.1%  204  33.0%  112   0.0032   0.0022
454_H_2    2.0%  198   2.6%  115   0.7801   0.7776
454_K_1    1.9%  215   2.6%  114   0.5738   0.5698
454_L_−1   6.7%  217   5.5%  109   0.6119   0.9390
454_M_1   42.3%  208  31.6%   98   0.0128   0.0197
454_M_2    6.8%  212   6.3%  111   0.8691   1.0000
454_M_+1  43.2%  212  32.3%  113   0.0071   0.0069
454_O_1   19.1%  215  12.2%  107   0.0330   0.0287
454_O_3   17.6%  216  19.2%  112   0.6692   0.1280
454_O_5    3.0%  215   1.9%  106   0.6018   0.5968
454_O_6   42.4%  198  32.9%  111   0.0205   0.0138
436_A_1    1.2%  203   1.9%  106   0.5013   0.3944
436_C_−1  15.3%  216  13.6%  114   0.6440   0.6479
436_D_1    5.0%  212   3.5%  114   0.4336   0.7884
436_E_1   46.3%  214  40.6%  112   0.1844   0.0382
436_G_1   10.4%  212  10.6%  109   1.0000   0.5823
436_K_−2  14.9%  204  10.4%  111   0.1121   0.1078
436_K_+1   4.7%  200   3.6%  112   0.5448   0.8901
436_L_−1   4.4%  217   7.8%  115   0.0757   0.0674
436_L_1    1.9%  214   1.8%  113   1.0000   1.0000
515_A_1   43.1%  211  42.0%  106   0.7990   0.2609
515_A_2   37.2%  211  35.2%  105   0.6615   0.2919
515_A_3    7.4%  217   6.6%  113   0.8734   0.8683
515_A_4   43.5%  208  42.1%  108   0.7995   0.1089
515_A_5    4.1%  207   2.3%  108   0.3603   0.3514
515_A_7    2.4%  213   1.4%  110   0.5583   0.5540
570_C_2    9.5%  215   7.7%   91   0.5379   0.8353
570_C_4    9.5%  217   7.2%   90   0.4358   0.7645
570_F_1   47.9%  215  50.6%   89   0.5929   0.5071
757_A_2   18.1%  210  26.3%   99   0.0252   0.0756
757_A_4    1.6%  218   4.6%   98   0.0504   0.0476
757_A_+4  39.4%  216  37.0%  104   0.6034   0.4277
698_E_1    6.9%  217  10.9%  105   0.0925   0.1182
698_I_+1  32.5%  209  31.9%  102   0.9273   0.8968
561_P_1   34.0%  209  39.4%  104   0.1854   0.4550
561_J_1   14.9%  208   9.0%  105   0.0435   0.1090
561_H_1    9.0%  212  13.7%  102   0.0722   0.0339
561_E_1    0.0%  178   0.6%   87   0.3283   0.3283
561_C_1    0.2%  217   1.0%  104   0.2465   0.2462
561_B_+1  13.4%  212  16.1%   90   0.4450   0.6122
561_B_1   48.3%  210  47.2%   90   0.8585   0.9707
561_Y_+1   4.7%  212   1.5%  100   0.0658   0.0607
561_X_−3  31.3%  214  30.9%  105   1.0000   0.4464
581_F_+2  24.3%  216  21.8%  110   0.4956   0.7746
722_C_1   33.0%  209  30.2%  111   0.4779   0.8190
722_F_+1   1.4%  217   1.4%  111   1.0000   1.0000
702_A_−1   7.0%  215   6.9%  109   1.0000   0.9377
702_C_1   49.3%  213  46.4%  111   0.5081   0.1817
702_D_1   16.1%  217  14.0%  111   0.4948   0.7583
702_F_1    3.7%  217   4.5%  111   0.6734   0.6706
702_I_1   18.6%  204  20.8%  101   0.5159   0.5125
702_I_3    0.7%  217   0.4%  111   1.0000   1.0000
214_B_1   17.8%  214  20.3%  118   0.4666   0.1354
214_E_−1  48.8%  202  48.2%  110   0.9332   0.8832
214_E_+1  28.3%  217  36.0%  118   0.0445   0.1073

TABLE 12B
ASSOCIATION ANALYSIS OF ASTHMA PHENOTYPE
UK POPULATION

UK population
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    7.3%  109   5.0%   70   0.5085   0.4932
454_E_−1  25.2%  137  29.8%   84   0.3206   0.5120
454_E_1    1.0%  104   3.7%   82   0.1456   0.2293
454_E_2   51.4%  140  62.6%   87   0.0200   0.0163
454_F_−2  33.2%  137  25.3%   87   0.0908   0.1154
454_G_−1   8.3%  138  11.3%   84   0.3185   0.3097
454_H_1   23.0%  135  31.9%   83   0.0441   0.0674
454_H_2    2.3%  131   2.3%   86   1.0000   1.0000
454_K_1    2.5%  138   3.0%   84   0.7711   0.7684
454_L_−1   5.7%  140   5.0%   80   0.8301   0.8286
454_M_1   42.9%  133  34.7%   72   0.1144   0.1532
454_M_2    5.9%  136   6.1%   82   1.0000   1.0000
454_M_+1  44.1%  136  35.1%   84   0.0722   0.0317
454_O_1   19.2%  138  11.0%   77   0.0295   0.0695
454_O_3   16.9%  139  18.3%   82   0.6994   0.3720
454_O_5    2.5%  139   1.9%   78   1.0000   1.0000
454_O_6   44.4%  124  35.4%   82   0.0816   0.0167
436_A_1    1.8%  135   1.9%   78   1.0000   0.7918
436_C_−1  13.7%  139  14.3%   84   0.8881   0.8219
436_D_1    3.7%  136   4.2%   84   0.8033   0.7996
436_E_1   45.3%  137  41.0%   83   0.4277   0.0947
436_G_1    9.8%  138  10.0%   80   1.0000   0.8893
436_K_−2  13.0%  131   9.8%   82   0.3556   0.5213
436_K_+1   2.8%  123   4.3%   82   0.5802   0.5733
436_L_−1   4.6%  140   8.2%   85   0.1513   0.1382
436_L_1    1.5%  137   1.8%   83   1.0000   1.0000
515_A_1   45.9%  135  42.2%   77   0.4783   0.7647
515_A_2   39.6%  135  35.5%   76   0.4651   0.7352
515_A_3    7.5%  140   8.4%   83   0.7192   0.7076
515_A_4   45.6%  136  42.8%   83   0.6204   0.7849
515_A_5    3.4%  133   1.8%   81   0.5477   0.5417
515_A_7    1.8%  138   1.8%   82   1.0000   1.0000
570_C_2    8.7%  138   9.6%   68   0.8548   0.9286
570_C_4    8.6%  140   9.0%   67   1.0000   0.9266
570_F_1   49.3%  138  49.3%   67   1.0000   0.7395
757_A_2   17.2%  137  26.0%   75   0.0325   0.0965
757_A_4    1.8%  140   4.7%   74   0.1198   0.1147
757_A_+4  40.3%  139  36.9%   80   0.5418   0.3027
698_E_1    5.4%  140  12.3%   81   0.0106   0.0174
698_I_+1  38.4%  133  35.3%   78   0.5336   0.8307
561_P_1   33.0%  135  41.3%   80   0.0965   0.2473
561_J_1   13.3%  132   7.4%   81   0.0790   0.1202
561_H_1    8.6%  139  10.6%   80   0.4995   0.2479
561_E_1    0.0%  110   0.0%   68   1.0000   1.0000
561_C_1    0.4%  140   0.6%   80   1.0000   1.0000
561_B_+1  15.0%  137  15.9%   66   0.8830   0.8929
561_B_1   52.2%  135  47.0%   66   0.3404   0.6227
561_Y_+1   5.5%  136   0.0%   78   0.0016   0.0013
561_X_−3  33.0%  138  32.1%   81   0.9160   0.2592
581_F_+2  24.6%  140  22.7%   86   0.6515   0.8695
722_C_1   35.0%  133  30.5%   87   0.3523   0.5556
722_F_+1   1.8%  139   1.1%   87   0.7121   0.7098
702_A_−1   7.2%  138   7.6%   86   1.0000   0.5211
702_C_1   47.8%  136  46.0%   87   0.7706   0.5005
702_D_1   17.9%  140  13.8%   87   0.2963   0.5003
702_F_1    2.9%  140   4.0%   87   0.5914   0.3738
702_I_1   18.3%  131  20.8%   77   0.6065   0.4616
702_I_3    1.1%  140   0.6%   87   1.0000   1.0000
214_B_1   19.2%  138  20.8%   89   0.7181   0.1738
214_E_−1  47.9%  140  50.6%   81   0.6218   0.8436
214_E_+1  30.7%  140  39.3%   89   0.0686   0.1558

TABLE 12C
ASSOCIATION ANALYSIS OF ASTHMA PHENOTYPE
US POPULATION

US population
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    6.8%   74   1.8%   28   0.2957   0.2815
454_E_−1  27.0%   76  19.0%   29   0.2844   0.3531
454_E_1    0.7%   75   0.0%   30   1.0000   1.0000
454_E_2   44.8%   77  53.3%   30   0.2881   0.5554
454_F_−2  37.2%   74  21.4%   28   0.0443   0.0288
454_G_−1   8.4%   77   5.2%   29   0.5654   0.5481
454_H_1   20.3%   69  36.2%   29   0.0292   0.0199
454_H_2    1.5%   67   3.5%   29   0.5856   0.5818
454_K_1    0.7%   77   1.7%   30   0.4831   0.4840
454_L_−1   8.4%   77   6.9%   29   1.0000   0.8698
454_M_1   41.3%   75  23.1%   26   0.0198   0.0792
454_M_2    8.6%   76   6.9%   29   0.7854   1.0000
454_M_+1  41.4%   76  24.1%   29   0.0247   0.0815
454_O_1   18.8%   77  15.0%   30   0.5573   0.7174
454_O_3   18.8%   77  21.7%   30   0.7022   0.3960
454_O_5    4.0%   76   1.8%   28   0.6772   0.6713
454_O_6   39.2%   74  25.9%   29   0.0772   0.2334
436_A_1    0.0%   68   1.8%   28   0.2917   0.2917
436_C_−1  18.2%   77  11.7%   30   0.3063   0.6276
436_D_1    7.2%   76   1.7%   30   0.1856   0.4802
436_E_1   48.0%   77  39.7%   29   0.2842   0.4465
436_G_1   11.5%   74  12.1%   29   1.0000   0.6979
436_K_−2  18.5%   73  12.1%   29   0.3047   0.4804
436_K_+1   7.8%   77   1.7%   30   0.1170   0.3281
436_L_−1   3.9%   77   6.7%   30   0.4717   0.4616
436_L_1    2.6%   77   1.7%   30   1.0000   1.0000
515_A_1   38.2%   76  41.4%   29   0.7520   0.0067
515_A_2   32.9%   76  34.5%   29   0.8705   0.0048
515_A_3    7.1%   77   1.7%   30   0.1858   0.1724
515_A_4   39.6%   72  40.0%   25   1.0000   0.0010
515_A_5    5.4%   74   3.7%   27   1.0000   1.0000
515_A_7    3.3%   75   0.0%   28   0.3262   0.3197
570_C_2   11.0%   77   2.2%   23   0.0794   0.2617
570_C_4   11.0%   77   2.2%   23   0.0794   0.3161
570_F_1   45.5%   77  54.5%   22   0.3085   0.0071
757_A_2   19.9%   73  27.1%   24   0.3153   0.3078
757_A_4    1.3%   78   4.2%   24   0.2359   0.2347
757_A_+4  37.7%   77  37.5%   24   1.0000   1.0000
698_E_1    9.7%   77   6.2%   24   0.5723   1.0000
698_I_+1  22.4%   76  20.8%   24   1.0000   0.8691
561_P_1   35.8%   74  33.3%   24   0.8623   0.9473
561_J_1   17.8%   76  14.6%   24   0.8257   1.0000
561_H_1    9.6%   73  25.0%   22   0.0192   0.0123
561_E_1    0.0%   68   2.6%   19   0.2184   0.2184
561_C_1    0.0%   77   2.1%   24   0.2376   0.2376
561_B_+1  10.7%   75  16.7%   24   0.3098   0.2553
561_B_1   41.3%   75  47.9%   24   0.5032   0.6426
561_Y_+1   3.3%   76   6.8%   22   0.3819   0.3735
561_X_−3  28.3%   76  27.1%   24   1.0000   1.0000
581_F_+2  23.7%   76  18.8%   24   0.5554   0.8790
722_C_1   29.6%   76  29.2%   24   1.0000   1.0000
722_F_+1   0.6%   78   2.1%   24   0.4161   0.4170
702_A_−1   6.5%   77   4.3%   23   0.7370   1.0000
702_C_1   52.0%   77  47.9%   24   0.7411   0.5438
702_D_1   13.0%   77  14.6%   24   0.8092   0.6925
702_F_1    5.2%   77   6.2%   24   0.7249   0.7199
702_I_1   19.2%   73  20.8%   24   0.8350   0.9140
702_I_3    0.0%   77   0.0%   24   1.0000   1.0000
214_B_1   15.1%   76  19.0%   29   0.5320   0.4893
214_E_−1  50.8%   62  41.4%   29   0.2668   0.4424
214_E_+1  24.0%   77  25.9%   29   0.8581   0.7552

b. Bronchial Hyper-Responsiveness:

The analyses were repeated using asthmatic children with borderline to severe BHR (PC₂₀≦16 mg/ml) or PC₂₀(16), as described in the Linkage Analysis section (Example 3). First, sibling pairs were identified where both sibs were affected and satisfied these new criteria. Of these pairs, one sib was included in the case/control analyses if they showed evidence of linkage at the gene of interest. This phenotype was more restrictive than the Asthma yes/no criteria; hence the number of cases included in the analyses was reduced by approximately half. Where the PC₂₀(16) subgroup represented a more genetically homogeneous sample, one could expect an increase in the effect size compared to the one observed in the original set of cases. However, the reduction in sample size could result in estimates that were less accurate. This, in turn, could obscure a trend in allele frequencies in the control group, the original set of cases, and the PC₂₀(16) subgroup. In addition, the reduction in sample size could induce a reduction in power (and increase in p-values) in spite of the larger effect size.
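The trade-off between effect size and sample size can be illustrated numerically. The sketch below uses a pooled two-proportion z-test (a normal approximation, not the Fisher's exact test used in the study) and assumes two alleles per genotyped individual. The inputs are the frequencies and sample sizes reported for SNP 454_E_2 in Tables 12A and 13A, and the function name is hypothetical.

```python
from math import erfc, sqrt


def two_proportion_p(freq1: float, n1: int, freq2: float, n2: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test.

    freq1, freq2 are allele frequencies as fractions; n1, n2 are allele
    counts (assumed here to be twice the number of individuals).
    """
    pooled = (freq1 * n1 + freq2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(freq1 - freq2) / se
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))


control_freq, control_alleles = 0.491, 2 * 217

# Asthma cases (Table 12A): 60.3% in 117 individuals.
effect_asthma = 0.603 - control_freq
p_asthma = two_proportion_p(control_freq, control_alleles, 0.603, 2 * 117)

# BHR subgroup (Table 13A): 63.6% in 55 individuals.
effect_bhr = 0.636 - control_freq
p_bhr = two_proportion_p(control_freq, control_alleles, 0.636, 2 * 55)

print(effect_asthma, p_asthma)
print(effect_bhr, p_bhr)
```

Under these assumptions the BHR subgroup shows the larger frequency difference but the larger p-value, which is the pattern the paragraph above anticipates when the case group is roughly halved.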

The significance levels (p-values) for allelic association of all typed SNPs to the BHR phenotype are plotted in FIG. 13 (combined population) and FIG. 14 (US and UK populations, separately). Frequencies and p-values for SNPs associated with the BHR phenotype are shown in Tables 13A, 13B, and 13C for the combined population and for the UK and US populations separately.

TABLE 13A
ASSOCIATION ANALYSIS OF BHR PHENOTYPE
COMBINED US/UK POPULATION

Combined US and UK
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    7.1%  183   4.3%   46   0.4798   0.4636
454_E_−1  25.8%  213  29.8%   52   0.4577   0.5705
454_E_1    0.8%  179   2.9%   51   0.1260   0.1246
454_E_2   49.1%  217  63.6%   55   0.0074   0.0140
454_F_−2  34.6%  211  24.6%   55   0.0519   0.0312
454_G_−1   8.4%  215  12.5%   52   0.1890   0.1353
454_H_1   22.1%  204  33.0%   53   0.0223   0.0181
454_H_2    2.0%  198   3.7%   54   0.2962   0.2923
454_K_1    1.9%  215   3.8%   53   0.2664   0.2621
454_L_−1   6.7%  217   6.0%   50   1.0000   0.8944
454_M_1   42.3%  208  35.2%   44   0.2349   0.4893
454_M_2    6.8%  212   6.9%   51   1.0000   0.8043
454_M_+1  43.2%  212  35.9%   53   0.1873   0.3681
454_O_1   19.1%  215  12.8%   47   0.1816   0.3182
454_O_3   17.6%  216  17.6%   51   1.0000   0.2734
454_O_5    3.0%  215   2.1%   48   1.0000   1.0000
454_O_6   42.4%  198  36.3%   51   0.3100   0.5059
436_A_1    1.2%  203   2.0%   49   0.6263   0.3975
436_C_−1  15.3%  216  12.3%   53   0.5402   0.7508
436_D_1    5.0%  212   0.9%   53   0.0971   0.1825
436_E_1   46.3%  214  45.1%   51   0.9120   0.2363
436_G_1   10.4%  212  11.0%   50   0.8565   0.4689
436_K_−2  14.9%  204  12.3%   53   0.5378   0.5523
436_K_+1   4.7%  200   1.0%   52   0.0925   0.3121
436_L_−1   4.4%  217   6.5%   54   0.3256   0.4368
436_L_1    1.9%  214   1.9%   53   1.0000   1.0000
515_A_1   43.1%  211  43.5%   46   1.0000   1.0000
515_A_2   37.2%  211  33.3%   45   0.5472   0.7363
515_A_3    7.4%  217   4.7%   53   0.3976   0.3791
515_A_4   43.5%  208  42.9%   49   1.0000   0.8566
515_A_5    4.1%  207   3.1%   48   1.0000   0.7745
515_A_7    2.4%  213   2.0%   51   1.0000   1.0000
570_C_2    9.5%  215   9.5%   42   1.0000   0.7549
570_C_4    9.5%  217   9.8%   41   1.0000   0.9219
570_F_1   47.9%  215  51.2%   41   0.6303   0.7776
757_A_2   18.1%  210  29.1%   43   0.0260   0.0659
757_A_4    1.6%  218   4.8%   42   0.0849   0.0826
757_A_+4  39.4%  216  38.9%   45   1.0000   0.7300
698_E_1    6.9%  217   9.8%   46   0.3791   0.4289
698_I_+1  32.5%  209  29.1%   43   0.6118   0.4244
561_P_1   34.0%  209  31.5%   46   0.7151   0.9229
561_J_1   14.9%  208   7.6%   46   0.0661   0.2215
561_H_1    9.0%  212  11.1%   45   0.5495   0.2181
561_E_1    0.0%  178   1.3%   38   0.1759   0.1759
561_C_1    0.2%  217   0.0%   46   1.0000   1.0000
561_B_+1  13.4%  212  14.5%   38   0.8557   0.2404
561_B_1   48.3%  210  47.4%   38   0.9013   0.6065
561_Y_+1   4.7%  212   0.0%   45   0.0329   0.0296
561_X_−3  31.3%  214  32.6%   46   0.8057   0.2221
581_F_+2  24.3%  216  16.0%   50   0.0852   0.1792
722_C_1   33.0%  209  27.0%   50   0.2827   0.1568
722_F_+1   1.4%  217   1.0%   50   1.0000   1.0000
702_A_−1   7.0%  215   8.2%   49   0.6666   0.7105
702_C_1   49.3%  213  42.0%   50   0.2212   0.0884
702_D_1   16.1%  217  11.0%   50   0.2193   0.5230
702_F_1    3.7%  217   5.0%   50   0.5674   0.4866
702_I_1   18.6%  204  19.1%   47   0.8842   0.0607
702_I_3    0.7%  217   1.0%   50   0.5648   0.5660
214_B_1   17.8%  214  24.0%   52   0.1630   0.0098
214_E_−1  48.8%  202  44.9%   49   0.5014   0.4228
214_E_+1  28.3%  217  38.5%   52   0.0568   0.0629

TABLE 13B
ASSOCIATION ANALYSIS OF BHR PHENOTYPE
UK POPULATION

UK population
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    7.3%  109   5.6%   36   0.7903   0.7821
454_E_−1  25.2%  137  30.9%   42   0.3234   0.2916
454_E_1    1.0%  104   3.8%   40   0.1331   0.1311
454_E_2   51.4%  140  65.9%   44   0.0196   0.0428
454_F_−2  33.2%  137  25.0%   44   0.1858   0.2137
454_G_−1   8.3%  138  13.1%   42   0.2033   0.1620
454_H_1   23.0%  135  33.3%   42   0.0629   0.0498
454_H_2    2.3%  131   4.7%   43   0.2708   0.2656
454_K_1    2.5%  138   3.6%   42   0.7041   0.7005
454_L_−1   5.7%  140   5.0%   40   1.0000   1.0000
454_M_1   42.9%  133  37.5%   36   0.5008   0.7446
454_M_2    5.9%  136   6.1%   41   1.0000   0.8275
454_M_+1  44.1%  136  38.1%   42   0.3776   0.5479
454_O_1   19.2%  138  11.1%   36   0.1197   0.3349
454_O_3   16.9%  139  15.0%   40   0.7360   0.6730
454_O_5    2.5%  139   1.3%   38   1.0000   1.0000
454_O_6   44.4%  124  37.8%   41   0.3068   0.4297
436_A_1    1.8%  135   2.6%   38   0.6510   0.4579
436_C_−1  13.7%  139  11.9%   42   0.8544   1.0000
436_D_1    3.7%  136   1.2%   42   0.4697   0.4625
436_E_1   45.3%  137  46.3%   41   0.8998   0.2517
436_G_1    9.8%  138  10.3%   39   0.8334   0.5314
436_K_−2  13.0%  131  11.9%   42   1.0000   1.0000
436_K_+1   2.8%  123   1.2%   41   0.6847   0.6807
436_L_−1   4.6%  140   8.1%   43   0.2747   0.2611
436_L_1    1.5%  137   2.4%   42   0.6283   0.6262
515_A_1   45.9%  135  43.1%   36   0.6912   0.5660
515_A_2   39.6%  135  32.9%   35   0.3355   0.5485
515_A_3    7.5%  140   5.9%   42   0.8101   0.8024
515_A_4   45.6%  136  43.9%   41   0.8016   0.9215
515_A_5    3.4%  133   2.5%   40   1.0000   1.0000
515_A_7    1.8%  138   2.4%   41   0.6619   0.6604
570_C_2    8.7%  138  10.3%   34   0.6413   0.5472
570_C_4    8.6%  140  10.6%   33   0.6321   0.4646
570_F_1   49.3%  138  48.5%   33   1.0000   0.1921
757_A_2   17.2%  137  27.1%   35   0.0632   0.1182
757_A_4    1.8%  140   4.4%   34   0.1910   0.1890
757_A_+4  40.3%  139  40.5%   37   1.0000   0.7080
698_E_1    5.4%  140  10.5%   38   0.1161   0.1501
698_I_+1  38.4%  133  32.9%   35   0.4873   0.4925
561_P_1   33.0%  135  34.2%   38   0.8906   1.0000
561_J_1   13.3%  132   5.3%   38   0.0650   0.1740
561_H_1    8.6%  139   8.1%   37   1.0000   1.0000
561_E_1    0.0%  110   0.0%   32   1.0000   1.0000
561_C_1    0.4%  140   0.0%   38   1.0000   1.0000
561_B_+1  15.0%  137  13.3%   30   0.8425   0.6278
561_B_1   52.2%  135  45.0%   30   0.3217   0.3440
561_Y_+1   5.5%  136   0.0%   38   0.0486   0.0437
561_X_−3  33.0%  138  35.5%   38   0.6825   0.0827
581_F_+2  24.6%  140  18.6%   43   0.3068   0.3581
722_C_1   35.0%  133  27.9%   43   0.2393   0.2426
722_F_+1   1.8%  139   1.2%   43   1.0000   1.0000
702_A_−1   7.2%  138   8.3%   42   0.8130   0.8054
702_C_1   47.8%  136  41.9%   43   0.3856   0.3789
702_D_1   17.9%  140  10.5%   43   0.1308   0.2813
702_F_1    2.9%  140   4.7%   43   0.4871   0.4261
702_I_1   18.3%  131  18.8%   40   1.0000   0.1367
702_I_3    1.1%  140   1.2%   43   1.0000   1.0000
214_B_1   19.2%  138  25.6%   43   0.2235   0.0237
214_E_−1  47.9%  140  45.0%   40   0.7039   0.6187
214_E_+1  30.7%  140  40.7%   43   0.0900   0.1646

TABLE 13C
ASSOCIATION ANALYSIS OF BHR PHENOTYPE
US POPULATION

US population
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    6.8%   74   0.0%   10   0.6099   0.5997
454_E_−1  27.0%   76  25.0%   10   1.0000   0.8773
454_E_1    0.7%   75   0.0%   11   1.0000   1.0000
454_E_2   44.8%   77  54.5%   11   0.4939   0.6511
454_F_−2  37.2%   74  22.7%   11   0.2356   0.3215
454_G_−1   8.4%   77  10.0%   10   0.6842   0.6809
454_H_1   20.3%   69  31.8%   11   0.2665   0.2593
454_H_2    1.5%   67   0.0%   11   1.0000   1.0000
454_K_1    0.7%   77   4.5%   11   0.2350   0.2356
454_L_−1   8.4%   77  10.0%   10   0.6842   0.6961
454_M_1   41.3%   75  25.0%    8   0.2845   0.5385
454_M_2    8.6%   76  10.0%   10   0.6875   0.6984
454_M_+1  41.4%   76  27.3%   11   0.2483   0.5213
454_O_1   18.8%   77  18.2%   11   1.0000   0.8406
454_O_3   18.8%   77  27.3%   11   0.3924   0.2974
454_O_5    4.0%   76   5.0%   10   0.5860   1.0000
454_O_6   39.2%   74  30.0%   10   0.4733   0.8144
436_A_1    0.0%   68   0.0%   11   1.0000   1.0000
436_C_−1  18.2%   77  13.6%   11   0.7696   1.0000
436_D_1    7.2%   76   0.0%   11   0.3630   0.6475
436_E_1   48.0%   77  40.0%   10   0.6353   0.8165
436_G_1   11.5%   74  13.6%   11   0.7268   0.5735
436_K_−2  18.5%   73  13.6%   11   0.7686   1.0000
436_K_+1   7.8%   77   0.0%   11   0.3665   0.4304
436_L_−1   3.9%   77   0.0%   11   1.0000   1.0000
436_L_1    2.6%   77   0.0%   11   1.0000   1.0000
515_A_1   38.2%   76  45.0%   10   0.6283   0.2911
515_A_2   32.9%   76  35.0%   10   1.0000   0.1469
515_A_3    7.1%   77   0.0%   11   0.3634   0.3458
515_A_4   39.6%   72  37.5%    8   1.0000   0.1886
515_A_5    5.4%   74   6.2%    8   1.0000   1.0000
515_A_7    3.3%   75   0.0%   10   1.0000   1.0000
570_C_2   11.0%   77   6.2%    8   1.0000   1.0000
570_C_4   11.0%   77   6.2%    8   1.0000   1.0000
570_F_1   45.5%   77  62.5%    8   0.2925   0.0453
757_A_2   19.9%   73  37.5%    8   0.1158   0.2069
757_A_4    1.3%   78   6.2%    8   0.2553   0.2566
757_A_+4  37.7%   77  31.3%    8   0.7873   1.0000
698_E_1    9.7%   77   6.2%    8   1.0000   1.0000
698_I_+1  22.4%   76  12.5%    8   0.5269   1.0000
561_P_1   35.8%   74  18.8%    8   0.2668   0.6666
561_J_1   17.8%   76  18.8%    8   1.0000   0.7579
561_H_1    9.6%   73  25.0%    8   0.0826   0.0677
561_E_1    0.0%   68   8.3%    6   0.0811   0.0811
561_C_1    0.0%   77   0.0%    8   1.0000   1.0000
561_B_+1  10.7%   75  18.8%    8   0.3996   0.1039
561_B_1   41.3%   75  56.3%    8   0.2937   0.4467
561_Y_+1   3.3%   76   0.0%    7   1.0000   1.0000
561_X_−3  28.3%   76  18.8%    8   0.5610   1.0000
581_F_+2  23.7%   76   0.0%    7   0.0416   0.2351
722_C_1   29.6%   76  21.4%    7   0.7592   1.0000
722_F_+1   0.6%   78   0.0%    7   1.0000   1.0000
702_A_−1   6.5%   77   7.1%    7   1.0000   0.6027
702_C_1   52.0%   77  42.9%    7   0.5841   0.1734
702_D_1   13.0%   77  14.3%    7   1.0000   0.7000
702_F_1    5.2%   77   7.1%    7   0.5522   0.5618
702_I_1   19.2%   73  21.4%    7   0.7357   0.3487
702_I_3    0.0%   77   0.0%    7   1.0000   1.0000
214_B_1   15.1%   76  16.7%    9   0.7414   1.0000
214_E_−1  50.8%   62  44.4%    9   0.8016   0.7277
214_E_+1  24.0%   77  27.8%    9   0.7732   0.4601

The results for the BHR sub-phenotype closely resembled the results observed for the asthma phenotype, described above. Namely, SNPs in Gene 454, Gene 757, and Gene 561 showed a significant association with the BHR phenotype in the combined population when comparing allele frequencies in the control and case populations. When analyzing the populations separately, SNPs in Gene 454 and Gene 561 showed a significant association in the UK population alone, while SNPs in Gene 581 showed a similar association with the phenotype in the US population. In addition, the genotypic comparison yielded significant results for SNPs in Gene 570 in the US population and in Gene 214, Gene 454 and Gene 561 for the UK and combined populations (see Tables 13A-13C).

The most significant results were obtained for Gene 454, where SNP E 2 showed a p-value of 0.007 for the allele test and p-value of 0.01 for the genotype test in the combined population (49% in control vs. 64% in cases). SNP E 2 was also significant in the UK population alone for the allele (p=0.02) and genotype (p=0.04) tests. Two more SNPs reached statistical significance in Gene 454 for this sub-phenotype: 1) SNP H 1 (p=0.02 in the combined population, 22% in controls vs. 33% in cases; p<0.05 for genotypic test in the UK population); and 2) SNP F −2 (genotypic p-value of 0.03 in the combined population).

For Gene 757, SNP A 2 was significant with a p-value of 0.03 in the combined population (18% in controls vs. 29% in cases).

One SNP in Gene 561 was significant in both the combined population and in the UK population alone (p=0.03 for both the allele and genotype tests in the combined population, 5% in controls vs. not present in cases; p<0.05 for allele and p=0.04 for genotype in UK, 6% in controls vs. not present in cases).

Gene 214 was significant in both the combined and UK populations when comparing the genotype frequencies between the cases and controls (p=0.01 combined population, p=0.02 UK population).

In the US population, one SNP in Gene 581 reached statistical significance (F +2, p=0.04, 24% in controls vs. not present in cases). The comparison of genotype frequencies also yielded a significant result for Gene 570 (SNP F 1, p<0.05).

c. Total IgE:

The analyses were performed using asthmatic children with elevated total IgE levels, as described in the Linkage Analysis section (Example 3). First, sibling pairs were identified where both sibs were affected and satisfied these new criteria. Of these pairs, one sib was included in the case/control analyses if they showed evidence of linkage at the gene of interest. This phenotype was more restrictive than the Asthma yes/no criteria; hence the number of cases included in the analyses was reduced by approximately 41%.

The significance levels (p-values) for allelic association of all typed SNPs to the IgE phenotype are plotted in FIG. 15 (combined population) and FIG. 16 (US and UK populations, separately). Frequencies and p-values for SNPs associated with the IgE phenotype are shown in Tables 14A, 14B, and 14C for the combined population and for the UK and US populations, separately.

TABLE 14A
ASSOCIATION ANALYSIS OF TOTAL IgE PHENOTYPE
COMBINED US/UK POPULATION

Combined US and UK
                FREQUENCIES          ALLELE GENOTYPE
GENE_EXON  CNTL    N   CASE    N  P-VALUE  P-VALUE
454_B_1    7.1%  183   5.2%   58   0.5295   0.5138
454_E_−1  25.8%  213  24.6%   69   0.8230   0.9474
454_E_1    0.8%  179   2.2%   67   0.3519   0.3491
454_E_2   49.1%  217  58.3%   72   0.0552   0.0364
454_F_−2  34.6%  211  22.5%   71   0.0089   0.0146
454_G_−1   8.4%  215  10.1%   69   0.4950   0.3130
454_H_1   22.1%  204  35.3%   68   0.0030   0.0010
454_H_2    2.0%  198   4.2%   71   0.2146   0.2088
454_K_1    1.9%  215   2.9%   69   0.4976   0.4937
454_L_−1   6.7%  217   6.0%   67   1.0000   0.9236
454_M_1   42.3%  208  29.3%   58   0.0133   0.0399
454_M_2    6.8%  212   6.6%   68   1.0000   1.0000
454_M_+1  43.2%  212  31.2%   69   0.0127   0.0283
454_O_1   19.1%  215   8.6%   64   0.0044   0.0211
454_O_3   17.6%  216  18.4%   68   0.8977   0.1305
454_O_5    3.0%  215   1.6%   63   0.5387   0.5327
454_O_6   42.4%  198  31.3%   67   0.0248   0.0468
436_A_1    1.2%  203   1.6%   64   0.6751   0.6934
436_C_−1  15.3%  216  13.8%   69   0.7838   1.0000
436_D_1    5.0%  212   3.6%   69   0.6444   0.8539
436_E_1   46.3%  214  39.0%   68   0.1390   0.1920
436_G_1   10.4%  212  10.0%   65   1.0000   0.9441
436_K_−2  14.9%  204   7.5%   67   0.0266   0.1029
436_K_+1   4.7%  200   3.6%   69   0.8111   1.0000
436_L_−1   4.4%  217   5.7%   70   0.4967   0.4872
436_L_1    1.9%  214   1.5%   69   1.0000   1.0000
515_A_1   43.1%  211  43.6%   63   0.9188   0.5757
515_A_2   37.2%  211  37.1%   62   1.0000   0.6857
515_A_3    7.4%  217   7.2%   69   1.0000   1.0000
515_A_4   43.5%  208  43.9%   66   1.0000   0.3943
515_A_5    4.1%  207   1.5%   66   0.2720   0.1767
515_A_7    2.4%  213   1.5%   67   0.7400   0.7370
570_C_2    9.5%  215   9.8%   51   1.0000   0.9293
570_C_4    9.5%  217   9.2%   49   1.0000   1.0000
570_F_1   47.9%  215  49.0%   50   0.9116   0.9785
757_A_2   18.1%  210  25.9%   54   0.0778   0.1407
757_A_4    1.6%  218   4.5%   55   0.0723   0.0700
757_A_+4  39.4%  216  41.2%   57   0.7473   0.0453
698_E_1    6.9%  217  11.2%   58   0.1706   0.2526
698_I_+1  32.5%  209  34.5%   55   0.7326   0.9308
561_P_1   34.0%  209  40.4%   57   0.2246   0.1509
561_J_1   14.9%  208  10.3%   58   0.2286   0.4828
561_H_1    9.0%  212   9.1%   55   1.0000   1.0000
561_E_1    0.0%  178   1.0%   48   0.2124   0.2124
561_C_1    0.2%  217   0.9%   57   0.3731   0.3734
561_B_+1  13.4%  212  14.3%   49   0.8701   0.8764
561_B_1   48.3%  210  44.9%   49   0.5754   0.7216
561_Y_+1   4.7%  212   1.8%   55   0.2788   0.2683
561_X_−3  31.3%  214  30.2%   58   0.9100   0.9259
581_F_+2  24.3%  216  22.1%   61   0.7182   0.7766
722_C_1   33.0%  209  30.3%   61   0.6602   0.7945
722_F_+1   1.4%  217   0.8%   61   1.0000   1.0000
702_A_−1   7.0%  215   7.5%   60   0.8413   0.5612
702_C_1   49.3%  213  45.1%   61   0.4721   0.3089
702_D_1   16.1%  217  13.9%   61   0.6723   0.7607
702_F_1    3.7%  217   6.6%   61   0.2044   0.2132
702_I_1   18.6%  204  22.3%   56   0.4185   0.3178
702_I_3    0.7%  217   0.8%   61   1.0000   1.0000
214_B_1   17.8%  214  20.0%   65   0.6044   0.0517
214_E_−1  48.8%  202  50.0%   59   0.8347   0.5796
214_E_+1  28.3%  217  38.5%   65   0.0305   0.0637

TABLE 14B ASSOCIATION ANALYSIS OF TOTAL IgE PHENOTYPE UK POPULATION UKpopulation FREQUENCIES ALLELE GENOTYPE GENE_EXON CNTL N CASE N P-VALUEP-VALUE 454_B_1 7.3% 109 5.3% 47 0.6272 0.6139 454_E_−1 25.2% 137 27.6%58 0.6154 0.8376 454_E_1 1.0% 104 2.7% 56 0.3478 0.3443 454_E_2 51.4%140 59.8% 61 0.1286 0.1334 454_F_−2 33.2% 137 24.6% 61 0.0983 0.1632454_G_−1 8.3% 138 11.2% 58 0.4433 0.3051 454_H_1 23.0% 135 29.8% 570.1586 0.1000 454_H_2 2.3% 131 3.3% 60 0.5130 0.5089 454_K_1 2.5% 1383.5% 58 0.7384 0.7348 454_L_−1 5.7% 140 6.2% 56 0.8151 0.7942 454_M_142.9% 133 33.7% 49 0.1192 0.2262 454_M_2 5.9% 136 7.0% 57 0.6502 0.8070454_M_+1 44.1% 136 34.5% 58 0.0911 0.0729 454_O_1 19.2% 138 8.5% 530.0125 0.0413 454_O_3 16.9% 139 18.4% 57 0.7693 0.1485 454_O_5 2.5% 1391.9% 53 1.0000 1.0000 454_O_6 44.4% 124 34.8% 56 0.1056 0.0462 436_A_11.8% 135 1.8% 54 1.0000 0.7330 436_C_−1 13.7% 139 14.7% 58 0.8734 0.8810436_D_1 3.7% 136 4.3% 58 0.7771 0.7732 436_E_1 45.3% 137 40.4% 57 0.43190.1736 436_G_1 9.8% 138 10.0% 55 1.0000 0.9280 436_K_−2 13.0% 131 7.0%57 0.1094 0.2734 436_K_+1 2.8% 123 4.3% 58 0.5325 0.5259 436_L_−1 4.6%140 5.9% 59 0.6184 0.6095 436_L_1 1.5% 137 0.9% 58 1.0000 1.0000 515_A_145.9% 135 43.3% 52 0.7281 0.9116 515_A_2 39.6% 135 36.3% 51 0.63340.8401 515_A_3 7.5% 140 8.6% 58 0.6854 0.6732 515_A_4 45.6% 136 43.9% 570.8227 0.9364 515_A_5 3.4% 133 1.7% 57 0.5167 0.5100 515_A_7 1.8% 1381.8% 56 1.0000 1.0000 570_C_2 8.7% 138 10.6% 47 0.5421 0.5764 570_C_48.6% 140 10.0% 45 0.6735 0.5645 570_F_1 49.3% 138 48.9% 46 1.0000 0.6579757_A_2 17.2% 137 26.7% 45 0.0649 0.0752 757_A_4 1.8% 140 4.3% 46 0.23320.2281 757_A_+4 40.3% 139 40.6% 48 1.0000 0.0252 698_E_1 5.4% 140 13.3%49 0.0140 0.0180 698_I_+1 38.4% 133 35.9% 46 0.7094 0.6131 561_P_1 33.0%135 40.6% 48 0.2126 0.2794 561_J_1 13.3% 132 7.1% 49 0.1388 0.3083561_H_1 8.6% 139 6.2% 48 0.5218 0.5022 561_E_1 0.0% 110 0.0% 41 1.00001.0000 561_C_1 0.4% 140 0.0% 48 1.0000 1.0000 561_B_+1 15.0% 137 15.0%40 1.0000 1.0000 561_B_1 52.2% 135 46.3% 
40 0.3744 0.5410 561_Y_+1 5.5%136 0.0% 47 0.0148 0.0129 561_X_−3 33.0% 138 32.6% 49 1.0000 0.7487581_F_+2 24.6% 140 21.7% 53 0.5941 0.8802 722_C_1 35.0% 133 29.3% 530.3304 0.4590 722_F_+1 1.8% 139 0.9% 53 1.0000 1.0000 702_A_−1 7.2% 1387.7% 52 0.8294 0.2921 702_C_1 47.8% 136 45.3% 53 0.7309 0.7299 702_D_117.9% 140 14.1% 53 0.4476 0.6600 702_F_1 2.9% 140 5.7% 53 0.2227 0.1546702_I_1 18.3% 131 20.8% 48 0.6485 0.4405 702_I_3 1.1% 140 0.9% 53 1.00001.0000 214_B_1 19.2% 138 21.3% 54 0.6700 0.1252 214_E_−1 47.9% 140 52.1%48 0.4807 0.5675 214_E_+1 30.7% 140 38.9% 54 0.1482 0.2246

TABLE 14C ASSOCIATION ANALYSIS OF TOTAL IgE PHENOTYPE US POPULATION USpopulation FREQUENCIES ALLELE GENOTYPE GENE_EXON CNTL N CASE N P-VALUEP-VALUE 454_B_1 6.8% 74 4.5% 11 1.0000 1.0000 454_E_−1 27.0% 76 9.1% 110.1094 0.3156 454_E_1 0.7% 75 0.0% 11 1.0000 1.0000 454_E_2 44.8% 7750.0% 11 0.6553 0.5025 454_F_−2 37.2% 74 10.0% 10 0.0214 0.1140 454_G_−18.4% 77 4.5% 11 1.0000 1.0000 454_H_1 20.3% 69 63.6% 11 0.0001 0.0001454_H_2 1.5% 67 9.1% 11 0.0957 0.0932 454_K_1 0.7% 77 0.0% 11 1.00001.0000 454_L_−1 8.4% 77 4.5% 11 1.0000 1.0000 454_M_1 41.3% 75 5.6% 90.0034 0.0204 454_M_2 8.6% 76 4.5% 11 1.0000 1.0000 454_M_+1 41.4% 7613.6% 11 0.0170 0.0132 454_O_1 18.8% 77 9.1% 11 0.3747 0.8371 454_O_318.8% 77 18.2% 11 1.0000 0.8248 454_O_5 4.0% 76 0.0% 10 1.0000 1.0000454_O_6 39.2% 74 13.6% 11 0.0299 0.0138 436_A_1 0.0% 68 0.0% 10 1.00001.0000 436_C_−1 18.2% 77 9.1% 11 0.3766 0.8138 436_D_1 7.2% 76 0.0% 110.3630 0.6475 436_E_1 48.0% 77 31.8% 11 0.1757 0.1728 436_G_1 11.5% 7410.0% 10 1.0000 1.0000 436_K_−2 18.5% 73 10.0% 10 0.5324 1.0000 436_K_+17.8% 77 0.0% 11 0.3665 0.4304 436_L_−1 3.9% 77 4.5% 11 1.0000 1.0000436_L_1 2.6% 77 4.5% 11 0.4913 0.4957 515_A_1 38.2% 76 45.5% 11 0.64090.0155 515_A_2 32.9% 76 40.9% 11 0.4765 0.0234 515_A_3 7.1% 77 0.0% 110.3634 0.3458 515_A_4 39.6% 72 44.4% 9 0.7999 0.0332 515_A_5 5.4% 740.0% 9 0.6008 0.5894 515_A_7 3.3% 75 0.0% 11 1.0000 1.0000 570_C_2 11.0%77 0.0% 4 1.0000 1.0000 570_C_4 11.0% 77 0.0% 4 1.0000 1.0000 570_F_145.5% 77 50.0% 4 1.0000 0.0375 757_A_2 19.9% 73 22.2% 9 0.7616 0.6620757_A_4 1.3% 78 5.6% 9 0.2808 0.2823 757_A_+4 37.7% 77 44.4% 9 0.61400.7660 698_E_1 9.7% 77 0.0% 9 0.3721 0.6785 698_I_+1 22.4% 76 27.8% 90.5652 0.3464 561_P_1 35.8% 74 38.9% 9 0.7997 0.2279 561_J_1 17.8% 7627.8% 9 0.3388 0.3272 561_H_1 9.6% 73 28.6% 7 0.0550 0.0419 561_E_1 0.0%68 7.1% 7 0.0933 0.0933 561_C_1 0.0% 77 5.6% 9 0.1047 0.1047 561_B_+110.7% 75 11.1% 9 1.0000 1.0000 561_B_1 41.3% 75 38.9% 9 1.0000 1.0000561_Y_+1 3.3% 76 12.5% 8 0.1348 0.1312 561_X_−3 
28.3% 76 16.7% 9 0.40450.7647 581_F_+2 23.7% 76 25.0% 8 1.0000 0.1246 722_C_1 29.6% 76 37.5% 80.5706 0.6624 722_F_+1 0.6% 78 0.0% 8 1.0000 1.0000 702_A_−1 6.5% 776.2% 8 1.0000 1.0000 702_C_1 52.0% 77 43.8% 8 0.6050 0.4101 702_D_113.0% 77 12.5% 8 1.0000 1.0000 702_F_1 5.2% 77 12.5% 8 0.2393 0.2370702_I_1 19.2% 73 31.3% 8 0.3233 0.3168 702_I_3 0.0% 77 0.0% 8 1.00001.0000 214_B_1 15.1% 76 13.6% 11 1.0000 1.0000 214_E_−1 50.8% 62 40.9%11 0.4894 0.5711 214_E_+1 24.0% 77 36.4% 11 0.2943 0.2456

For the total IgE phenotype, SNPs in Gene 454, Gene 436, and Gene 214 showed a significant association in the combined population when comparing the allele frequencies in the case and control groups. When the populations were analyzed separately, SNPs in Gene 454 were significant in both the UK and US populations, while SNPs in Gene 698 and Gene 561 showed a significant association in the UK population. Additional significant results were identified when comparing the genotype frequencies in the case and control groups: SNPs in Gene 454 (US, UK, and combined), Gene 515 (US), Gene 570 (US), Gene 757 (UK and combined), Gene 698 (UK), and Gene 561 (US and UK) reached statistical significance.
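The allele-frequency comparison between cases and controls can be illustrated with a short sketch. The text does not name the statistical test behind the tabulated allele p-values, so the two-sided Fisher exact test used here is an assumption, as are the helper names `allele_counts` and `fisher_exact_two_sided`. Allele counts are rebuilt from the rounded frequencies and sample sizes in the tables, so the result is only approximate.

```python
from math import comb


def allele_counts(freq, n_individuals):
    """Convert an allele frequency and a sample size into (allele, other) chromosome counts."""
    chromosomes = 2 * n_individuals        # two alleles per genotyped individual
    carriers = round(freq * chromosomes)   # tabulated frequencies are rounded, so this is approximate
    return carriers, chromosomes - carriers


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]] (assumed test)."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x copies of the allele in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# SNP H 1 of Gene 454, total IgE phenotype, US population (values taken from Table 14C)
ctrl = allele_counts(0.203, 69)
case = allele_counts(0.636, 11)
p_value = fisher_exact_two_sided(*case, *ctrl)
print(ctrl, case, p_value)
```

The printed p-value can be compared with the tabulated allele p-value for the same SNP; small differences are expected because the counts are reconstructed from rounded percentages.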

The most significant results were obtained for Gene 454, where 6 SNPs showed significant association with the phenotype at the allelic level in the combined population and in one of the subpopulations. SNP H 1 showed highly significant results in the combined and US populations (p=0.003 for the allele test and p=0.001 for the genotype test in the combined population, 22% in controls vs. 35% in cases; p=0.0001 in the US for both tests, 20% in controls vs. 64% in cases). Two other SNPs had p-values <0.01 in the combined population: 1) SNP F −2 (p=0.009 for the allele test and p=0.01 for the genotype test in the combined population, 35% in controls vs. 23% in cases; p=0.02 in the US, 37% in controls vs. 10% in cases); and 2) SNP O 1 (p=0.004 for the allele test and p=0.02 for the genotype test in the combined population, 19% in controls vs. 9% in cases; p=0.01 and p=0.04 for the allele and genotype tests, respectively, in the UK, 19% in controls vs. 8% in cases). Another SNP in exon O (O 6) had a p-value in the significant range (p=0.02 and p<0.05 for the allele and genotype tests, respectively, in the combined population, 42% of controls vs. 31% of cases; p=0.03 for the allele test and p=0.01 for the genotype test in the US, 39% of controls vs. 14% of cases; p<0.05 in the UK for the genotype test). In addition, two SNPs in high linkage disequilibrium with each other reached statistical significance in exon M: 1) M 1 (p=0.01 and p=0.04 for the allele and genotype tests, respectively, in the combined population, 42% in controls vs. 29% in cases; p=0.003 for the allele test and p=0.02 for the genotype test in the US, 41% in controls vs. 6% in cases); and 2) M +1 (p=0.01 for the allele test and p=0.03 for the genotype test in the combined population, 43% in controls vs. 31% in cases; p=0.02 and p=0.01 for the allele and genotype tests, respectively, in the US, 41% of controls vs. 14% of cases).

Gene 436 and Gene 214 each showed a single SNP that reached statistical significance in the combined population only. In Gene 436, which is adjacent to Gene 454, SNP K −2 was significant (p=0.03, 15% in controls vs. 7% in cases), while in Gene 214, SNP E +1 reached a similar level of significance (p=0.03, 28% in controls vs. 38% in cases).

For Gene 561, SNP Y +1 reached statistical significance in the UK population (p=0.01 for both the allele and genotype tests, 6% in controls vs. no occurrence in cases), while SNP H 1 showed a significant genotype test in the US population (p=0.04).

A single SNP in Gene 698 showed a significant association with the total IgE subphenotype in the UK population (p=0.01 for the allele test and p=0.02 for the genotype test, 5% of controls vs. 13% of cases).

For Gene 757, SNP A +4 showed a significant genotype test in both the combined and the UK samples (p<0.05 combined, p=0.03 UK).

SNPs in two genes, Gene 515 and Gene 570, had significant genotype p-values in the US population alone (515 A 1, p=0.02; 515 A 2, p=0.02; 515 A 4, p=0.03; 570 F 1, p=0.04).

d. Specific IgE:

The analyses were performed using asthmatic children with elevated specific IgE levels for at least one allergen, as described in the Linkage Analysis section (Example 3). First, sibling pairs were identified where both sibs were affected and satisfied this new criterion. Of these pairs, one sib was included in the case/control analyses if they showed evidence of linkage at the gene of interest. This phenotype was more restrictive than the Asthma yes/no criteria; hence, the number of cases included in the analyses was reduced by approximately 38%.

The significance levels (p-values) for allelic association of the typed SNPs to the specific IgE phenotype are plotted in FIG. 17 (combined population) and FIG. 18 (US and UK populations, separately). Frequencies and p-values for SNPs associated with the specific IgE phenotype are shown in Tables 15A, 15B, and 15C for the combined population and for the UK and US populations, separately.

TABLE 15A ASSOCIATION ANALYSIS OF SPECIFIC IgE PHENOTYPE COMBINED US/UKPOPULATIONS Combined US and UK FREQUENCIES ALLELE GENOTYPE GENE_EXONCNTL N CASE N P-VALUE P-VALUE 454_B_1 7.1% 183 4.4% 57 0.3858 0.3685454_E_−1 25.8% 213 27.5% 69 0.7385 0.8563 454_E_1 0.8% 179 2.2% 690.3551 0.3524 454_E_2 49.1% 217 59.2% 71 0.0422 0.0191 454_F_−2 34.6%211 24.3% 70 0.0279 0.0078 454_G_−1 8.4% 215 8.7% 69 0.8621 0.3087454_H_1 22.1% 204 37.5% 68 0.0006 0.0003 454_H_2 2.0% 198 3.6% 70 0.33920.3340 454_K_1 1.9% 215 3.6% 70 0.3240 0.3186 454_L_−1 6.7% 217 7.4% 680.8457 0.9252 454_M_1 42.3% 208 29.5% 61 0.0115 0.0316 454_M_2 6.8% 2128.0% 69 0.7032 0.8568 454_M_+1 43.2% 212 30.7% 70 0.0097 0.0245 454_O_119.1% 215 11.7% 64 0.0625 0.1300 454_O_3 17.6% 216 19.6% 69 0.61220.3714 454_O_5 3.0% 215 1.5% 65 0.5384 0.5325 454_O_6 42.4% 198 31.3% 670.0248 0.0468 436_A_1 1.2% 203 1.5% 65 0.6786 0.6956 436_C_−1 15.3% 21615.0% 70 1.0000 0.8439 436_D_1 5.0% 212 2.9% 70 0.3531 0.5945 436_E_146.3% 214 40.6% 69 0.2791 0.1703 436_G_1 10.4% 212 11.9% 67 0.63200.6418 436_K_−2 14.9% 204 9.6% 68 0.1478 0.2662 436_K_+1 4.7% 200 2.9%69 0.4675 0.7077 436_L_−1 4.4% 217 4.9% 71 0.8164 0.8123 436_L_1 1.9%214 1.4% 70 1.0000 1.0000 515_A_1 43.1% 211 39.1% 64 0.4746 0.4804515_A_2 37.2% 211 35.2% 64 0.7536 0.7931 515_A_3 7.4% 217 7.1% 70 1.00001.0000 515_A_4 43.5% 208 40.8% 65 0.6126 0.3468 515_A_5 4.1% 207 1.5% 660.2720 0.1767 515_A_7 2.4% 213 0.8% 67 0.4735 0.4692 570_C_2 9.5% 21510.4% 53 0.8545 0.8669 570_C_4 9.5% 217 9.6% 52 1.0000 1.0000 570_F_147.9% 215 48.1% 52 1.0000 0.7607 757_A_2 18.1% 210 26.7% 58 0.04860.1177 757_A_4 1.6% 218 5.1% 59 0.0381 0.0362 757_A_+4 39.4% 216 39.3%61 1.0000 0.2242 698_E_1 6.9% 217 10.7% 61 0.1808 0.2696 698_I_+1 32.5%209 35.6% 59 0.5802 0.7116 561_P_1 34.0% 209 37.3% 59 0.5127 0.1175561_J_1 14.9% 208 12.3% 61 0.5571 0.7970 561_H_1 9.0% 212 13.6% 590.1635 0.1228 561_E_1 0.0% 178 1.0% 50 0.2193 0.2193 561_C_1 0.2% 2170.8% 60 0.3866 0.3869 561_B_+1 13.4% 212 15.7% 51 0.5276 0.6102 
561_B_148.3% 210 43.1% 51 0.3773 0.6088 561_Y_+1 4.7% 212 1.7% 59 0.1892 0.1797561_X_−3 31.3% 214 31.1% 61 1.0000 0.8639 581_F_+2 24.3% 216 21.2% 660.4850 0.7321 722_C_1 33.0% 209 30.3% 66 0.5949 0.5842 722_F_+1 1.4% 2171.5% 66 1.0000 1.0000 702_A_−1 7.0% 215 5.4% 65 0.6871 0.8700 702_C_149.3% 213 42.4% 66 0.1948 0.2350 702_D_1 16.1% 217 13.6% 66 0.58350.7346 702_F_1 3.7% 217 6.8% 66 0.1457 0.1282 702_I_1 18.6% 204 18.9% 611.0000 0.9620 702_I_3 0.7% 217 0.8% 66 1.0000 1.0000 214_B_1 17.8% 21421.4% 70 0.3816 0.1069 214_E_−1 48.8% 202 51.5% 65 0.6147 0.4246214_E_+1 28.3% 217 36.4% 70 0.0732 0.1699

TABLE 15B ASSOCIATION ANALYSIS OF SPECIFIC IgE PHENOTYPE UK POPULATIONUK population FREQUENCIES ALLELE GENOTYPE GENE_EXON CNTL N CASE NP-VALUE P-VALUE 454_B_1 7.3% 109 4.9% 41 0.6057 0.5919 454_E_−1 25.2%137 29.8% 52 0.3638 0.5956 454_E_1 1.0% 104 2.9% 51 0.3357 0.3319454_E_2 51.4% 140 59.4% 53 0.1712 0.0812 454_F_−2 33.2% 137 27.4% 530.3250 0.1607 454_G_−1 8.3% 138 9.6% 52 0.6859 0.4084 454_H_1 23.0% 13532.0% 50 0.0817 0.0497 454_H_2 2.3% 131 2.9% 52 0.7176 0.7151 454_K_12.5% 138 3.9% 52 0.5014 0.4963 454_L_−1 5.7% 140 6.9% 51 0.6346 0.6851454_M_1 42.9% 133 34.8% 46 0.1789 0.2977 454_M_2 5.9% 136 7.8% 51 0.48360.6185 454_M_+1 44.1% 136 35.6% 52 0.1606 0.1308 454_O_1 19.2% 138 12.0%46 0.1520 0.3074 454_O_3 16.9% 139 19.6% 51 0.5458 0.5276 454_O_5 2.5%139 2.1% 48 1.0000 1.0000 454_O_6 44.4% 124 36.0% 50 0.1866 0.0843436_A_1 1.8% 135 2.1% 48 1.0000 0.7112 436_C_−1 13.7% 139 16.3% 520.5150 0.7671 436_D_1 3.7% 136 3.9% 52 1.0000 1.0000 436_E_1 45.3% 13743.1% 51 0.7279 0.1848 436_G_1 9.8% 138 12.0% 50 0.5669 0.7893 436_K_−213.0% 131 9.8% 51 0.4765 0.8412 436_K_+1 2.8% 123 3.9% 51 0.7370 0.7329436_L_−1 4.6% 140 4.7% 53 1.0000 1.0000 436_L_1 1.5% 137 1.0% 52 1.00001.0000 515_A_1 45.9% 135 39.4% 47 0.2805 0.4975 515_A_2 39.6% 135 35.1%47 0.4624 0.5032 515_A_3 7.5% 140 9.6% 52 0.5288 0.5106 515_A_4 45.6%136 41.2% 51 0.4841 0.7514 515_A_5 3.4% 133 2.0% 50 0.7340 0.7298515_A_7 1.8% 138 1.0% 50 1.0000 1.0000 570_C_2 8.7% 138 11.9% 42 0.39600.4214 570_C_4 8.6% 140 11.0% 41 0.5152 0.5345 570_F_1 49.3% 138 46.4%42 0.7088 0.8441 757_A_2 17.2% 137 26.1% 44 0.0869 0.0746 757_A_4 1.8%140 5.6% 45 0.0678 0.0650 757_A_+4 40.3% 139 40.4% 47 1.0000 0.1374698_E_1 5.4% 140 12.8% 47 0.0217 0.0283 698_I_+1 38.4% 133 36.7% 450.8028 0.9428 561_P_1 33.0% 135 38.9% 45 0.3087 0.2299 561_J_1 13.3% 13210.6% 47 0.5897 0.5133 561_H_1 8.6% 139 9.6% 47 0.8341 0.8259 561_E_10.0% 110 0.0% 39 1.0000 1.0000 561_C_1 0.4% 140 0.0% 46 1.0000 1.0000561_B_+1 15.0% 137 16.2% 37 0.8554 0.9302 561_B_1 52.2% 135 
43.2% 370.1905 0.3461 561_Y_+1 5.5% 136 0.0% 46 0.0152 0.0133 561_X_−3 33.0% 13831.9% 47 0.8992 0.6051 581_F_+2 24.6% 140 22.1% 52 0.6870 0.9085 722_C_135.0% 133 29.8% 52 0.3918 0.5778 722_F_+1 1.8% 139 1.0% 52 1.0000 1.0000702_A_−1 7.2% 138 5.9% 51 0.8196 0.8126 702_C_1 47.8% 136 42.3% 520.3570 0.6439 702_D_1 17.9% 140 14.4% 52 0.5402 0.6574 702_F_1 2.9% 1406.7% 52 0.1328 0.0644 702_I_1 18.3% 131 17.0% 47 0.8760 1.0000 702_I_31.1% 140 1.0% 52 1.0000 1.0000 214_B_1 19.2% 138 22.1% 52 0.5656 0.1256214_E_−1 47.9% 140 55.3% 47 0.2340 0.2996 214_E_+1 30.7% 140 40.4% 520.0880 0.1988

TABLE 15C ASSOCIATION ANALYSIS OF SPECIFIC IgE PHENOTYPE US POPULATIONUS population FREQUENCIES ALLELE GENOTYPE GENE_EXON CNTL N CASE NP-VALUE P-VALUE 454_B_1 6.8% 74 3.1% 16 0.6918 0.6812 454_E_−1 27.0% 7620.6% 17 0.5205 0.6956 454_E_1 0.7% 75 0.0% 18 1.0000 1.0000 454_E_244.8% 77 58.3% 18 0.1940 0.3576 454_F_−2 37.2% 74 14.7% 17 0.0144 0.0720454_G_−1 8.4% 77 5.9% 17 1.0000 1.0000 454_H_1 20.3% 69 52.8% 18 0.00020.0006 454_H_2 1.5% 67 5.6% 18 0.1974 0.1956 454_K_1 0.7% 77 2.8% 180.3439 0.3447 454_L_−1 8.4% 77 8.8% 17 1.0000 0.7920 454_M_1 41.3% 7513.3% 15 0.0034 0.0214 454_M_2 8.6% 76 8.3% 18 1.0000 0.8025 454_M_+141.4% 76 16.7% 18 0.0066 0.0220 454_O_1 18.8% 77 11.1% 18 0.3355 0.7150454_O_3 18.8% 77 19.4% 18 1.0000 0.7780 454_O_5 4.0% 76 0.0% 17 0.59420.5880 454_O_6 39.2% 74 17.6% 17 0.0177 0.0437 436_A_1 0.0% 68 0.0% 171.0000 1.0000 436_C_−1 18.2% 77 11.1% 18 0.4578 0.8790 436_D_1 7.2% 760.0% 18 0.1275 0.3513 436_E_1 48.0% 77 33.3% 18 0.1373 0.2901 436_G_111.5% 74 11.8% 17 1.0000 0.8232 436_K_−2 18.5% 73 8.8% 17 0.2097 0.7103436_K_+1 7.8% 77 0.0% 18 0.1272 0.3516 436_L_−1 3.9% 77 5.6% 18 0.64780.6439 436_L_1 2.6% 77 2.8% 18 1.0000 1.0000 515_A_1 38.2% 76 38.2% 171.0000 0.0236 515_A_2 32.9% 76 35.3% 17 0.8414 0.0275 515_A_3 7.1% 770.0% 18 0.1286 0.1169 515_A_4 39.6% 72 39.3% 14 1.0000 0.0307 515_A_55.4% 74 0.0% 16 0.3538 0.3418 515_A_7 3.3% 75 0.0% 17 0.5860 0.5798570_C_2 11.0% 77 4.5% 11 0.7043 0.7580 570_C_4 11.0% 77 4.5% 11 0.70431.0000 570_F_1 45.5% 77 55.0% 10 0.4796 0.1204 757_A_2 19.9% 73 28.6% 140.3178 0.1027 757_A_4 1.3% 78 3.6% 14 0.3924 0.3942 757_A_+4 37.7% 7735.7% 14 1.0000 1.0000 698_E_1 9.7% 77 3.6% 14 0.4729 0.7729 698_I_+122.4% 76 32.1% 14 0.3337 0.2961 561_P_1 35.8% 74 32.1% 14 0.8304 0.2910561_J_1 17.8% 76 17.9% 14 1.0000 0.8292 561_H_1 9.6% 73 29.2% 12 0.01420.0197 561_E_1 0.0% 68 4.5% 11 0.1392 0.1392 561_C_1 0.0% 77 3.6% 140.1538 0.1538 561_B_+1 10.7% 75 14.3% 14 0.5260 0.5082 561_B_1 41.3% 7542.9% 14 1.0000 1.0000 561_Y_+1 3.3% 76 7.7% 13 
0.2714 0.2702 561_X_−328.3% 76 28.6% 14 1.0000 1.0000 581_F_+2 23.7% 76 17.9% 14 0.6274 0.3029722_C_1 29.6% 76 32.1% 14 0.8239 0.7629 722_F_+1 0.6% 78 3.6% 14 0.28190.2826 702_A_−1 6.5% 77 3.6% 14 1.0000 1.0000 702_C_1 52.0% 77 42.9% 140.4162 0.0825 702_D_1 13.0% 77 10.7% 14 1.0000 1.0000 702_F_1 5.2% 777.1% 14 0.6533 0.6486 702_I_1 19.2% 73 25.0% 14 0.4519 0.6731 702_I_30.0% 77 0.0% 14 1.0000 1.0000 214_B_1 15.1% 76 19.4% 18 0.6124 0.5757214_E_−1 50.8% 62 41.7% 18 0.3509 0.4623 214_E_+1 24.0% 77 25.0% 181.0000 1.0000

For the specific IgE subphenotype, SNPs in Gene 454 and Gene 757 showed a significant association in the combined population when comparing the allele frequencies in the case and control groups. When analyzing the populations separately, SNPs in Gene 561 showed a significant association in both the US and UK populations. In addition, five SNPs in Gene 454 showed association with the subphenotype in the US population. Gene 698 contained a SNP reaching statistical significance in the UK population only. Additional significant results were identified when comparing the genotype frequencies in the case and control groups. SNPs in Gene 515, Gene 561, and Gene 454 reached statistical significance in the US population. SNPs in Gene 454, Gene 561, and Gene 698 were significant in the UK and in the combined population. In addition, a SNP in Gene 757 was significant at the 0.05 level in the combined population.

The most significant results were found in Gene 454, where 6 SNPs yielded significant association with the subphenotype in the combined population. SNP H 1 showed highly significant results in the combined and US populations (p=0.0006 and p=0.0003 for the allele and genotype tests, respectively, in the combined population, 22% in controls vs. 38% in cases; p=0.0002 for the allele test and p=0.0006 for the genotype test in the US population, 20% in controls vs. 53% in cases; genotype test p<0.05 in the UK population). Two SNPs in exon M gave significant results: 1) M 1 (p=0.01 and p=0.03 for the allele and genotype tests, respectively, in the combined population, 42% in controls vs. 30% in cases; p=0.003 for the allele test and p=0.02 for the genotype test in the US, 41% in controls vs. 13% in cases); and 2) M +1 (p=0.01 and p=0.02 for the allele and genotype tests, respectively, in the combined population, 43% in controls vs. 31% in cases; p=0.007 for the allele test and p=0.02 for the genotype test in the US, 41% of controls vs. 17% of cases). Three other SNPs had p-values <0.05 in the combined population: 1) SNP E 2 (p=0.04 and p=0.02 for the allele and genotype tests, respectively, in the combined population, 49% in controls vs. 59% in cases); 2) SNP F −2 (p=0.03 and p=0.008 for the allele and genotype tests, respectively, in the combined population, 35% in controls vs. 24% in cases; p=0.01 for the allele test in the US, 37% in controls vs. 15% in cases); and 3) SNP O 6 (p=0.02 and p<0.05 for the allele and genotype tests, respectively, in the combined population, 42% in controls vs. 31% in cases; p=0.02 for the allele test and p=0.04 for the genotype test in the US, 39% in controls vs. 18% in cases).

For Gene 561, SNP Y +1 reached statistical significance in the UK population (p=0.02 and p=0.01 for the allele and genotype tests, respectively, 6% in controls vs. no occurrence in cases), while SNP H 1 had a significant p-value in the US population (p=0.01 and p=0.02 for the allele and genotype tests, respectively, 10% in controls vs. 29% in cases).

A single SNP in Gene 698 showed a significant association with the specific IgE subphenotype in the UK population (p=0.02 and p=0.03 for the allele and genotype tests, respectively, 5% of controls vs. 13% of cases).

For Gene 757, SNPs A 2 and A 4 showed a significant association with the subphenotype in the combined population (A 2, p<0.05, 18% in controls vs. 27% in cases; A 4, p=0.04 for both the allele and genotype tests, 2% in controls vs. 5% in cases).

Additionally, three SNPs in Gene 515 had significant genotype p-values in the US population alone (A 1, p=0.02; A 2, p=0.03; A 4, p=0.03).

In summary, evidence obtained from association studies implicated several genes in the 12q23-qter region as being involved in respiratory diseases. This was supported by analysis of the asthma (yes/no) phenotype, BHR phenotype, total IgE phenotype, and specific IgE phenotype in asthmatic individuals. Thus, chromosome 12q23-qter encompassed genes involved in asthma and related diseases.

Example 12 Haplotype Analyses

In addition to the analysis of individual SNPs, haplotype frequencies between the case and control groups were also compared. The haplotypes were constructed using a maximum likelihood approach. Since existing software for predicting haplotypes was unable to utilize individuals with missing data, a program was developed to analyze all individuals. This provided more accurate haplotype frequency estimates. Haplotype analysis based on multiple SNPs in a gene was expected to provide increased evidence for an association between a given phenotype and that gene if all haplotyped SNPs were involved in the manifestation of the phenotype. In other words, allelic variation involving the haplotyped SNPs was expected to be associated with different risks of or susceptibilities to the phenotype.
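The maximum likelihood construction of two-SNP haplotypes can be sketched with an expectation-maximization (EM) loop. This is an illustration only: the program described above is not disclosed, the function name `em_two_snp_haplotypes` and the 0/1/2 genotype coding are assumptions, and, unlike that program, this sketch does not handle individuals with missing data.

```python
def em_two_snp_haplotypes(genotypes, max_iter=500, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), each the number of copies (0, 1, or 2) of
    allele "1" at SNP 1 and SNP 2. Returns frequencies for the haplotypes
    "00", "01", "10", "11" (first character SNP 1, second character SNP 2).
    """
    haps = ("00", "01", "10", "11")
    freq = {h: 0.25 for h in haps}          # uniform starting point
    n_chrom = 2 * len(genotypes)
    for _ in range(max_iter):
        counts = {h: 0.0 for h in haps}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10;
                # split it according to the current frequency estimates
                cis = freq["00"] * freq["11"]
                trans = freq["01"] * freq["10"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts["00"] += w
                counts["11"] += w
                counts["01"] += 1.0 - w
                counts["10"] += 1.0 - w
            elif g1 == 1:
                # Heterozygous at SNP 1 only: phase is unambiguous
                b = g2 // 2
                counts[f"0{b}"] += 1.0
                counts[f"1{b}"] += 1.0
            else:
                # Homozygous at SNP 1: phase is unambiguous
                a = g1 // 2
                for b in [1] * g2 + [0] * (2 - g2):
                    counts[f"{a}{b}"] += 1.0
        # M-step: new frequencies are the expected haplotype counts per chromosome
        new_freq = {h: counts[h] / n_chrom for h in haps}
        converged = max(abs(new_freq[h] - freq[h]) for h in haps) < tol
        freq = new_freq
        if converged:
            break
    return freq


example = em_two_snp_haplotypes([(0, 0), (2, 2), (1, 1)])
print(example)
```

Only double heterozygotes carry phase ambiguity for a pair of biallelic SNPs, which is why a single weight per such individual is enough in the E-step.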

The estimated frequency of each haplotype was compared between cases and controls by a permutation test. An overall comparison of the distribution of all haplotypes between the two groups was also performed. For each gene with two SNPs or more, all 2-at-a-time haplotypes were constructed, and their frequencies were compared between the case and control groups. P-values for the overall comparisons were plotted against a coordinate system based on genomic sequence (average location of the two SNPs in the haplotype). This was used to visualize regions where haplotype association was present. A small p-value (or a large value of −log(p) as plotted in the figures described below) was indicative of an association between the haplotyped SNPs and the disease phenotype. The analysis was repeated for the US and UK populations, separately, to adjust for the possibility of genetic heterogeneity.
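A permutation test of a haplotype frequency difference can be sketched as follows. The sketch makes simplifying assumptions that are not taken from the text: each chromosome is represented by a 0/1 indicator of carrying the haplotype of interest (that is, phase is treated as known rather than re-estimated in every permutation), the statistic is the absolute difference in frequency, and the name `permutation_p_value` and its parameters are hypothetical.

```python
import random


def permutation_p_value(case_values, control_values, n_perm=2000, seed=0):
    """Permutation p-value for the absolute difference in mean between two groups.

    case_values / control_values: 0/1 indicators per chromosome, so each group
    mean is the haplotype frequency in that group (assumed representation).
    """
    rng = random.Random(seed)               # fixed seed for reproducibility
    pooled = list(case_values) + list(control_values)
    n_case = len(case_values)

    def abs_diff(values):
        cases, controls = values[:n_case], values[n_case:]
        return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

    observed = abs_diff(pooled)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # reassign case/control labels at random
        if abs_diff(pooled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_perm + 1)


separated = permutation_p_value([1] * 10, [0] * 10)
identical = permutation_p_value([1, 0, 1, 0], [1, 0, 1, 0])
print(separated, identical)
```

Because the p-value is estimated from a finite number of shuffles, its smallest attainable value is 1/(n_perm + 1); very small p-values require correspondingly many permutations.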

1. Asthma Phenotype:

FIG. 19 (combined population) and FIG. 20 (US and UK populations separately) show the results for the haplotype analysis (2-at-a-time) for all SNPs in Gene 214, Gene 436, Gene 454, Gene 515, Gene 561, Gene 570, Gene 698, Gene 702, Gene 722, and Gene 757.

The most significantly associated haplotype was formed by SNPs E −1 and E +1 from Gene 214, which had a p-value of 0.00001 in the combined population (p=0.00002 in the UK, non-significant in the US). This SNP combination was much more significant than either of these SNPs analyzed alone (combined population p=0.04 for E +1 and p=0.93 for E −1). Eighteen SNP combinations had p-values <0.01 in Gene 454 in the combined population, with the most significant haplotype consisting of SNPs E 2 and F −2. This haplotype had a p-value of 0.001 in the combined population (p=0.008 in the US, p<0.05 in the UK). Although this result was more significant than the analysis of these SNPs alone, the levels of significance found in the haplotypes of Gene 454 were comparable to the significance obtained from the analysis of the SNPs alone (in the combined population: E 2 and M +1, p=0.003; G −1 and M +1, p=0.004; E −1 and E 2, p=0.004; E 2 and H 1, p=0.004; E 2 and O 6, p=0.004; E 2 and M 1, p=0.004; H 1 and O 3, p=0.005; E 1 and E 2, p=0.005; B 1 and H 1, p=0.006; E 1 and H 1, p=0.006; E 1 and M +1, p=0.007; E 1 and F −2, p=0.007; B 1 and E 2, p=0.007; G −1 and H 1, p=0.008; H 1 and O 1, p=0.008; F −2 and M 1, p=0.009; E 2 and G −1, p=0.01).

In Gene 561, a single haplotype (J 1 and H 1) reached statistical significance at the 0.01 level in the combined population (p=0.008), while all eight haplotype combinations with SNP Y +1 yielded significant results at the 0.01 level in the UK population (P 1 and Y +1, p=0.0006; C 1 and Y +1, p=0.0007; E 1 and Y +1, p=0.0008; J 1 and Y +1, p=0.001; H 1 and Y +1, p=0.001; B +1 and Y +1, p=0.002; B 1 and Y +1, p=0.002; Y +1 and X −3, p=0.004). The SNP combination of H 1 and E 1 had a significant association in the US population (p=0.009). In addition, in the combined population, the haplotypes formed by SNPs A 2 and A +4 in Gene 757 were more significantly associated with the disease (p=0.004) than either of these SNPs alone (p=0.03 for A 2, p=0.60 for A +4).

2. Bronchial Hyper-Responsiveness:

A similar test for association of 2-SNP-at-a-time haplotypes with BHR (PC₂₀ 16 mg/ml) was performed. In FIGS. 21 and 22, the haplotype analysis (2-at-a-time) for all SNPs in Gene 214, Gene 436, Gene 454, Gene 515, Gene 561, Gene 570, Gene 698, Gene 702, Gene 722, and Gene 757 is shown for the combined population, and for the UK and the US populations, respectively.

The most significantly associated haplotype was formed by SNPs E −1 and E +1 from Gene 214, which had a p-value of 0.00007 in the combined population (p=0.0002 in the UK, non-significant in the US). Four SNP combinations had p-values <0.01 in Gene 454 in the combined population (E −1 and E 2, p=0.004; E 1 and E 2, p=0.004; E 2 and G −1, p=0.005; E 2 and F −2, p=0.008), and one SNP combination in the UK (E 1 and E 2, p=0.007). In Gene 561, four haplotypes reached statistical significance at the 0.01 level in the combined population (J 1 and H 1, p=0.003; E 1 and Y +1, p=0.003; J 1 and E 1, p=0.006; J 1 and Y +1, p=0.009), one in the UK population (J 1 and Y +1, p=0.01), and one in the US (H 1 and E 1, p=0.002). In addition, in the combined population, a haplotype formed by SNPs in Gene 757 (A 2 and A +4, p=0.003) was significant at the 0.01 level.

3. Total IgE:

A similar test for association of 2-SNP-at-a-time haplotypes with elevated levels of total IgE was performed. In FIGS. 23 and 24, the haplotype analysis (2-at-a-time) for all SNPs in Gene 214, Gene 436, Gene 454, Gene 515, Gene 561, Gene 570, Gene 698, Gene 702, Gene 722, and Gene 757 is shown for the combined population, and for the UK and the US populations, respectively.

The most significantly associated haplotype was formed by SNPs E −1 and E +1 from Gene 214, with a p-value of 0.000003 in the combined population (p=0.000005 in the UK, non-significant in the US). Thirteen SNP combinations had p-values <0.01 in Gene 454 in the combined population (K 1 and O 1, p=0.002; H 1 and O 1, p=0.002; O 1 and O 3, p=0.004; E 1 and O 1, p=0.005; G −1 and H 1, p=0.006; H 1 and O 3, p=0.007; F −2 and M 1, p=0.008; H 2 and O 1, p=0.008; B 1 and O 1, p=0.009; M 1 and O 1, p=0.009; G −1 and M +1, p=0.009; E 2 and O 1, p=0.009; F −2 and H 1, p=0.01), one SNP combination in the UK (K 1 and O 1, p=0.007), and twenty-nine SNP combinations in the US (H 1 and K 1, p=0.0001; E 1 and H 1, p=0.0002; H 1 and H 2, p=0.0002; E 2 and H 1, p=0.0003; B 1 and H 1, p=0.0003; H 1 and M 1, p=0.0003; G −1 and H 1, p=0.0004; H 1 and O 5, p=0.0004; H 1 and O 3, p=0.0004; H 1 and O 6, p=0.0005; E −1 and H 1, p=0.0006; F −2 and H 1, p=0.0006; H 1 and M +1, p=0.0006; H 1 and O 1, p=0.0007; H 1 and L −1, p=0.0008; H 1 and M 2, p=0.001; M 1 and O 3, p=0.001; M 1 and O 5, p=0.002; H 2 and M 1, p=0.002; E 1 and M 1, p=0.002; K 1 and M 1, p=0.002; F −2 and M 1, p=0.004; L −1 and M 1, p=0.005; M 1 and M 2, p=0.005; B 1 and M 1, p=0.005; F −2 and G −1, p=0.005; M 1 and O 1, p=0.006; E 2 and M 1, p=0.006; M 1 and M +1, p=0.007). In Gene 561, three haplotypes reached statistical significance at the 0.01 level in the UK sample (E 1 and Y +1, p=0.008; C 1 and Y +1, p=0.008; H 1 and Y +1, p=0.009), and two reached statistical significance in the US sample (E 1 and C 1, p=0.004; E 1 and Y +1, p=0.004). In Gene 757, the haplotype formed with SNP A 2 and SNP A +4 was significant at the 0.01 level in the combined population (p=0.002) and in the UK population (p=0.006).

4. Specific IgE:

A similar test for association of 2-SNP-at-a-time haplotypes with elevated levels of specific IgE was performed. In FIGS. 25 and 26, the haplotype analysis (2-at-a-time) for all SNPs in Genes 214, 436, 454, 515, 561, 570, 698, 702, 722, and 757 is shown for the combined population, and for the UK and the US populations, respectively.

The most significantly associated haplotype was formed by SNPs E −1 and E +1 from Gene 214, with a p-value of 0.000006 in the combined population (p=0.000003 in the UK, non-significant in the US). Sixteen SNP combinations had p-values <0.01 in Gene 454 in the combined population (H 1 and O 3, p=0.0007; H 1 and K 1, p=0.002; G −1 and H 1, p=0.002; H 1 and O 1, p=0.003; E 2 and H 1, p=0.003; H 1 and H 2, p=0.003; E 1 and H 1, p=0.003; B 1 and H 1, p=0.003; H 1 and M 2, p=0.003; H 1 and O 5, p=0.004; H 1 and M +1, p=0.004; H 1 and L −1, p=0.004; F −2 and H 1, p=0.004; H 1 and M 1, p=0.005; E −1 and H 1, p=0.006; H 1 and O 6, p=0.007), and thirty-three SNP combinations in the US (H 1 and M 1, p=0.0005; E 1 and H 1, p=0.0006; H 1 and O 5, p=0.0007; H 1 and O 6, p=0.0008; H 1 and M +1, p=0.0009; H 1 and K 1, p=0.001; F −2 and H 1, p=0.001; K 1 and M 1, p=0.001; H 1 and H 2, p=0.001; M 1 and O 3, p=0.001; G −1 and H 1, p=0.001; H 1 and O 3, p=0.001; B 1 and H 1, p=0.001; H 1 and L −1, p=0.002; E −1 and H 1, p=0.002; E 2 and H 1, p=0.002; H 1 and M 2, p=0.002; H 1 and O 1, p=0.002; M 1 and M +1, p=0.002; K 1 and M +1, p=0.003; M 1 and O 5, p=0.003; M +1 and O 5, p=0.004; E 2 and M 1, p=0.005; F −2 and K 1, p=0.005; E 1 and M 1, p=0.005; K 1 and O 6, p=0.006; M +1 and O 6, p=0.006; H 2 and M 1, p=0.007; F −2 and O 3, p=0.008; M +1 and O 3, p=0.008; E 2 and M +1, p=0.009; M 1 and O 1, p=0.009; O 5 and O 6, p=0.009). In Gene 561, two haplotypes reached statistical significance at the 0.01 level in the UK population (E 1 and Y +1, p=0.003; C 1 and Y +1, p=0.007), and two reached statistical significance in the US population (E 1 and C 1, p=0.007; H 1 and E 1, p=0.009). In Gene 757, the haplotype formed with SNP A 2 and SNP A +4 was significant at the 0.01 level in the combined population (p=0.002) and in the UK sample (p=0.006).

In summary, haplotype analysis of the SNPs provided additional evidence demonstrating the presence of asthma susceptibility genes on chromosome 12. In some SNP combinations, the level of significance of the association was increased by an order of magnitude.

Example 13 Transmission Disequilibrium Test (TDT)

A family-based test of association, the transmission disequilibrium test (TDT), was conducted for Gene 454. By selecting a single affected offspring in each family, the TDT performed a test of association (due to linkage disequilibrium) in the presence of linkage. The test determined whether a particular allele or genotype was preferentially transmitted to an affected individual over what would be expected by chance. Only heterozygote parents were considered informative for the TDT. In addition, to increase power, heterozygote parents transmitting a different allele to two affected offspring were ignored. Accordingly, the TDT would be based on the same families that contributed to the linkage signal. The significance levels were estimated by Markov Chain Monte Carlo simulation methods as implemented in TDTEX from the S.A.G.E. program (1997, Department of Epidemiology and Biostatistics, Rammelkamp Center for Education and Research, MetroHealth Campus, Case Western Reserve University, Cleveland, Ohio). As only heterozygote parents contributed information to the TDT, SNP haplotypes (all 2-at-a-time and all 3-at-a-time) were also constructed based on family data with the program GENEHUNTER (Kruglyak et al., 1996). This served to increase the informativeness of the single SNPs. These haplotypes were then used as “alleles” in subsequent TDT analyses. In addition, p-values obtained from the TDT analyses were compared to the p-values obtained from the haplotyping in the case/control setting. To check for consistency, the p-values associated with testing frequencies in cases and controls were examined for the overtransmitted alleles or genotypes identified in the TDT.
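The logic of the TDT can be illustrated with the classical asymptotic form of the test, which compares how often heterozygote parents transmit versus do not transmit a given allele to an affected offspring. This is a swapped-in simplification: the analyses described above estimated significance by Markov Chain Monte Carlo simulation in TDTEX, not by the chi-square approximation shown here, and the function name `tdt` and the example counts are hypothetical.

```python
from math import erfc, sqrt


def tdt(transmitted, untransmitted):
    """Classical TDT statistic for one allele.

    transmitted / untransmitted: number of heterozygote parents who did / did
    not transmit the allele to the affected offspring. Returns (chi2, p_value),
    using a chi-square approximation with 1 degree of freedom.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        return 0.0, 1.0                     # no informative parents
    chi2 = (b - c) ** 2 / (b + c)           # McNemar-type statistic
    p_value = erfc(sqrt(chi2 / 2.0))        # upper tail of chi-square with 1 df
    return chi2, p_value


# Hypothetical counts, for illustration only (not taken from the study)
chi2, p = tdt(30, 10)
print(chi2, p)
```

Under the null hypothesis of no linkage or no association, a heterozygote parent transmits either allele with probability one half, so the statistic stays near zero unless one allele is preferentially transmitted.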

1. Asthma Phenotype:

Three candidate SNPs for Gene 454 were typed in the extended population in order to investigate further the association seen in the case-control study. All three SNPs result in amino acid changes (E 2, histidine to tyrosine (C→T); H 1 and H 2, arginine to histidine (G→A)). Results are shown in Table 16. Column 1 lists the exon(s) containing the SNP(s) of interest. Column 2 lists the overtransmitted alleles or genotypes. Column 3 lists the TDT p-values. Columns 4, 5, and 6 list the case/control p-values, the frequencies in the controls, and the frequencies in the cases of the overtransmitted alleles or genotypes, respectively.

Since the TDT was not influenced by admixture, it was performed using the combined US and UK populations. For SNPs E 2 and H 1, the genotype formed by the CA/CA haplotypes was significantly overtransmitted to the affected individuals (p=0.04). In addition, this genotype was found in only 2% of the controls while 12% of the cases harbor this genotype. This difference was highly significant (p=0.0002). For the SNP combination comprising H 1 and H 2, the AG/AG genotypes were overtransmitted to affected individuals. This result approached the statistical level of 0.05 (p=0.06). Moreover, this genotype was more frequent in the cases (14%) compared to the controls (2%), and this difference was highly significant (p=0.00005). The TDT results supported the association previously observed in the case-control studies for Gene 454. The results also pointed to a recessive mechanism of transmission, as the genotype test showed the strongest evidence of association.

TABLE 16 TDT ANALYSIS OF ASTHMA PHENOTYPE
Asthma Yes/No, Combined US and UK

Exon              Allele/Genotype   TDT p-value   Case/Control p-value   Control Freq   Case Freq

Over-Transmitted Allele
454_E_2           T                 1.0000        0.0058                 50.9%          39.7%
454_H_1           A                 0.3484        0.0032                 22.1%          33.0%
454_H_2           G                 0.1094        0.7801                 98.0%          97.4%
454_E_2_H_1       CA                0.3874        0.0008                 15.4%          27.6%
454_E_2_H_2       CG                0.4612        0.0097                 48.4%          59.4%
454_H_1_H_2       AG                0.0900        0.0036                 20.0%          30.4%
454_E_2_H_1_H_2   CAG               0.2167        0.0015                 15.3%          26.8%

Over-Transmitted Genotype
454_E_2           TT                0.8375        0.0070                 28.6%          13.7%
454_H_1           AA                0.1057        0.0022                 4.9%           17.0%
454_H_2           GG                0.1107        0.7776                 96.0%          94.8%
454_E_2_H_1       CA/CA             0.0359        0.0002                 1.5%           11.6%
454_E_2_H_2       CG/CG             0.2829        0.2477                 26.8%          33.0%
454_E_2_H_2       TG/TG             0.2829        0.0038                 26.3%          12.2%
454_H_1_H_2       AG/AG             0.0637        0.00005                2.0%           14.3%
454_E_2_H_1_H_2   CAG/CAG           0.0877        0.0001                 1.0%           10.7%
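
As an illustration of how strongly a homozygous genotype is enriched in cases, an odds ratio can be computed from the control and case frequencies reported in Table 16. This is a sketch only: odds ratios are not reported in this Example, the function name is hypothetical, and no confidence interval can be derived because the underlying genotype counts are not given here.

```python
def odds_ratio(case_freq: float, control_freq: float) -> float:
    """Odds ratio for carrying a genotype, from its frequency in cases and controls."""
    for freq in (case_freq, control_freq):
        if not 0.0 < freq < 1.0:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    case_odds = case_freq / (1.0 - case_freq)
    control_odds = control_freq / (1.0 - control_freq)
    return case_odds / control_odds


# Genotype frequencies taken from Table 16 (control, case).
ca_ca = odds_ratio(case_freq=0.116, control_freq=0.015)   # 454_E_2_H_1 CA/CA
ag_ag = odds_ratio(case_freq=0.143, control_freq=0.020)   # 454_H_1_H_2 AG/AG
print(f"CA/CA odds ratio: {ca_ca:.1f}")
print(f"AG/AG odds ratio: {ag_ag:.1f}")
```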

2. Bronchial Hyper-Responsiveness:

The TDT analyses were repeated using only the asthmatic pairs that satisfied the additional criteria of having a PC₂₀≦16 mg/ml (Table 17). As for the case of the asthma yes/no phenotype, significance was reached with the genotypic TDT test. For this subphenotype, genotype AA of SNP H 1 was overtransmitted to affected individuals (p=0.04). This genotype was also present more often in the cases than in the controls (17% cases, 5% controls, p=0.02). Two haplotype combinations had overtransmitted genotypes that approached statistically significant levels: genotype CA/CA for SNPs E 2 and H 1 (p=0.06) and genotype CAG/CAG for SNPs E 2, H 1 and H 2 (p=0.06). Both of these genotypes were found more often in the cases (CA/CA 13%, CAG/CAG 11%) than in the controls (CA/CA 2%, CAG/CAG 1%), and these differences were highly significant (p=0.0008 for CA/CA, p=0.0014 for CAG/CAG).

TABLE 17 TDT ANALYSIS OF BHR PHENOTYPE
Combined US and UK

Exon              Allele/Genotype   TDT p-value   Case/Control p-value   Control Freq   Case Freq

Over-Transmitted Allele
454_E_2           C                 1.0000        0.0074                 49.1%          63.6%
454_H_1           A                 0.7974        0.0223                 22.1%          33.0%
454_H_2           G                 0.6252        0.2962                 98.0%          96.3%
454_E_2_H_1       CA                0.7986        0.0037                 15.4%          28.8%
454_E_2_H_2       CG                0.7156        0.0274                 48.4%          61.4%
454_H_1_H_2       AG                0.5338        0.0442                 20.0%          29.3%
454_E_2_H_1_H_2   CAG               0.9090        0.0192                 15.3%          26.4%

Over-Transmitted Genotype
454_E_2           TT                0.7917        0.0140                 28.6%          10.9%
454_H_1           AA                0.0429        0.0181                 4.9%           17.0%
454_H_2           GG                0.6235        0.2923                 96.0%          92.6%
454_E_2_H_1       CA/CA             0.0601        0.0008                 1.5%           13.2%
454_E_2_H_2       CG/CG             0.7211        0.2373                 26.8%          35.2%
454_H_1_H_2       AG/AG             0.1319        0.0022                 2.0%           13.2%
454_E_2_H_1_H_2   CAG/CAG           0.0632        0.0014                 1.0%           11.3%

3. Total IgE:

The TDT analyses were also performed using the phenotype previously described for total IgE (Table 18). Again, significance was reached with the genotypic TDT test. For this subphenotype, genotype AA of SNP H 1 was overtransmitted to affected individuals (p=0.03). This genotype was also present more often in the cases than in the controls (21% cases, 5% controls, p=0.0001). Two genotypes for the SNP combination formed by E 2 and H 1 had statistically significant overtransmission: genotype CA/CA and genotype CA/TA (p<0.05). Both genotypes were found more often in the cases (CA/CA 12%, CA/TA 9%) than in the controls (CA/CA 2%, CA/TA 3%), and these differences were significant (p=0.0009 for CA/CA, p=0.03 for CA/TA).

TABLE 18 TDT ANALYSIS OF TOTAL IgE PHENOTYPE
Combined US and UK

Exon              Allele/Genotype   TDT p-value   Case/Control p-value   Control Freq   Case Freq

Over-Transmitted Allele
454_E_2           C                 1.0000        0.0552                 49.1%          58.3%
454_H_1           A                 0.0821        0.0030                 22.1%          35.3%
454_H_2           G                 0.2896        0.2146                 98.0%          95.8%
454_E_2_H_1       CA                0.5439        0.0040                 15.4%          27.3%
454_E_2_H_2       CG                0.6807        0.1055                 48.4%          56.8%
454_H_1_H_2       AG                0.3460        0.0116                 20.0%          30.9%
454_E_2_H_1_H_2   CAG               0.3447        0.0101                 15.3%          25.8%

Over-Transmitted Genotype
454_H_1           AA                0.0349        0.0001                 4.9%           20.6%
454_H_2           GG                0.2888        0.2088                 96.0%          91.6%
454_E_2_H_1       CA/CA             0.0477        0.0009                 1.5%           11.8%
454_E_2_H_1       CA/TA             0.0477        0.0314                 2.5%           8.8%
454_E_2_H_2       CG/CG             0.7049        0.8766                 26.8%          28.2%
454_H_1_H_2       AG/AG             0.1707        0.00009                2.0%           16.2%
454_E_2_H_1_H_2   CAG/CAG           0.1457        0.0013                 1.0%           10.3%

4. Specific IgE:

The TDT analyses were performed using the phenotype previously described for specific IgE (Table 19). There were no alleles or genotypes that were significantly overtransmitted at the 0.05 level. However, the test for the overtransmission of genotype AA of SNP H 1 had a p-value <0.1. This genotype was present more often in the cases than in the controls (22% cases, 5% controls, p=0.0003).

TABLE 19 TDT ANALYSIS OF SPECIFIC IgE PHENOTYPE
Combined US and UK

Exon              Allele/Genotype   TDT p-value   Case/Control p-value   Control Freq   Case Freq

Over-Transmitted Allele
454_H_1           A                 0.1555        0.00006                22.1%          37.5%
454_H_2           G                 0.3757        0.3392                 98.0%          96.4%
454_E_2_H_1       CA                0.7101        0.0006                 15.4%          30.2%
454_E_2_H_2       TG                0.8317        0.0332                 49.5%          38.6%
454_H_1_H_2       AG                0.1369        0.0012                 20.0%          33.9%
454_E_2_H_1_H_2   CAG               0.6602        0.0012                 15.3%          29.3%

Over-Transmitted Genotype
454_H_1           AA                0.0910        0.00003                4.9%           22.1%
454_H_2           GG                0.3740        0.3340                 96.0%          92.9%
454_E_2_H_1       CA/TA             0.2586        0.0314                 2.5%           8.8%
454_E_2_H_2       TG/TG             0.7369        0.0118                 26.3%          11.4%
454_H_1_H_2       AG/AG             0.1104        0.00003                2.0%           17.7%
454_E_2_H_1_H_2   CAG/CAG           0.3841        0.0004                 1.0%           11.8%
454_E_2_H_1_H_2   CGA/CGA           0.3841        1.0000                 0.0%           0.0%

Example 14 Gene Analysis and Potential Function

1. Functional Role of Gene 454 in Asthma and Related Diseases

Extracellular ATP triggers a variety of responses in several cell types, including contraction of smooth muscles, regulation of nitric oxide production from endothelium, stimulation of cytokine release from immune cells, and modulation of several other metabolic pathways. The receptors that mediate these diverse effects are the P2 purinoreceptors, which are divided into two subgroups: P2Y and P2X receptors. The P2X receptors are a family of multimeric ligand-gated ion channels activated solely by extracellular ATP and structurally distinct from other ligand-gated channels. Gene 454 represents the seventh member of the P2X receptor family, P2X7. The nucleic acid sequence of Gene 454 corresponds to SEQ ID NO:19, and the encoded amino acid sequence corresponds to SEQ ID NO:111, as disclosed herein (see FIGS. 7A-7H). The Gene 454 transcript is 5.087 Kb, the gene is ˜55 Kb in size, and includes 13 exons. The Gene 454 ORF is 1788 bp long and encodes a 596 amino acid protein. The 5′ and 3′ untranslated regions are 69 bp and 3230 bp in length, respectively. As determined by the experiments described herein, Gene 454 is expressed in brain, heart, skeletal muscle, spleen, kidney, liver, placenta, lung, leukocytes, lymph and fetal liver tissues (FIG. 6).
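
The transcript dimensions stated above can be cross-checked with simple arithmetic. The sketch below only restates the figures given in this paragraph; the variable names are illustrative, and the text does not say whether the stated ORF length includes a stop codon.

```python
# Figures for the Gene 454 transcript as stated in the text (base pairs).
UTR5_BP = 69
ORF_BP = 1788
UTR3_BP = 3230

# Total transcript length is the sum of the 5' UTR, the ORF, and the 3' UTR.
transcript_bp = UTR5_BP + ORF_BP + UTR3_BP

# An ORF must contain a whole number of three-nucleotide codons.
codons, remainder = divmod(ORF_BP, 3)

print(f"transcript length: {transcript_bp} bp ({transcript_bp / 1000:.3f} Kb)")
print(f"ORF codons: {codons}, leftover nucleotides: {remainder}")
```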

Data have indicated that the P2X7 receptor is involved in cell death, cytokine release, and the shedding of surface antigens. The P2X7 receptor also mediates activation of the transcription factors NF-κB and NFAT. The P2X7 receptor displays unique permeability properties. At low ATP concentrations P2X7 forms small ATP-gated cation channels, allowing the influx of small cations, including Ca²⁺, into the intracellular environment. Notably, in rat peritoneal mast cells, there is a direct correlation between the influx of Ca²⁺ and the release of histamine as a consequence of ATP levels (Schulman et al., 1999, Am. J. Respir. Cell Mol. Biol. 20:530-537). In addition, at these levels of ATP, various proteases are activated including membrane metalloproteases and intracellular caspases (Gu et al., 1998, Blood 92:946-951).

At high ATP concentrations, the P2X7 receptor pore size increases, allowing the passage of anions as well as cations up to 900 daltons in size (Nihei et al., 2000, Mem. Inst. Oswaldo Cruz 95:415-428). Interestingly, inhalation of aerosolized ATP has been shown to trigger bronchoconstriction in healthy and asthmatic individuals. In asthmatics, ATP was 50 times more potent than methacholine, and 87-fold more potent than histamine, in producing a 15% decrease in FEV₁ (Schulman et al., 1999, Am. J. Respir. Cell Mol. Biol. 20:530-537). This suggests that extracellular ATP acts as an important modulator of pro-inflammatory regulation via the P2X7 receptor.

The P2X7 protein contains two transmembrane domains connected by a large extracellular loop, and intracellular N-terminal and C-terminal domains (FIG. 10). P2X7 shares significant amino acid identity with the other members of the P2X receptor family (30-40%), except in the C-terminal domain, which is 240 amino acids long. P2X7 contains a long unique carboxyl terminus, which appears to be involved in the permeability properties of the P2X7 receptor. Truncation of the cytoplasmic tail abolishes ATP-induced uptake of the fluorescent dye YoPro-1 and ethidium bromide (Gu et al., 2001, JBC 276:11135-11142). Further, a SNP (A→C) in the cytoplasmic tail was identified in the Caucasian population. The SNP results in a glutamic acid to alanine change at amino acid 496. This amino acid substitution results in a loss of functional P2X7 in homozygotes, and results in a 50% loss of function in heterozygotes (Gu et al., 2001, JBC 276:11135-11142). The expression of P2X7 has been observed mainly in cells of the immune and hematopoietic system, and P2X7 has been shown to mediate the ATP-induced apoptotic death in monocytes, macrophages, and lymphocytes. However, P2X7 expression has been observed in other cell types at lower levels. In particular, fibroblasts express P2X7, and are responsive to ATP.

Fibroblasts are non-excitable cells that play a role in the modulation of a variety of microenvironmental situations to which these cells are exposed. In the lung, fibroblasts lie in the lamina propria under the basement membrane. The bronchial epithelium lies above the basement membrane, and is attached thereto. In accordance with one model of respiratory diseases, allergens cause the cells of the bronchial epithelium to release their cytoplasmic contents. The cellular ATP concentration of each cell is estimated to be 5-10 mM. The released ATP immediately passes through the basement membrane by passive diffusion. This triggers the P2X7 receptors on the surface of the fibroblast cells to dilate, forming an open channel. The P2X7 receptors allow the influx of cations and anions up to 900 daltons. One of these ions triggers a signal transduction cascade that induces the final step in the post-translational processing of pro-IL-1β, a multipotential inflammatory mediator (Solle et al., JBC 276:125-132). The mature IL-1β binds to receptors on target cells that elicit signaling cascades. This leads to the up-regulation of gene products such as matrix metalloproteases, cyclooxygenase-2, IL-6 and cellular adhesion molecules, which contribute to inflammation.

IL-6 is an important pro-inflammatory cytokine that is secreted by mononuclear phagocytes, antigen-presenting cells, and fibroblasts. In accordance with the current knowledge in the art, secretion of IL-6 creates a pro-inflammatory microenvironment that induces the release of other factors such as growth factors, cytokines, and prostaglandins. This, in turn, enhances the stimulation and propagation of fibroblasts, and leads to an increase in the release of pro-inflammatory molecules. Fibroblasts also play a role in exuding extracellular matrix. Notably, in asthmatics, the basement membrane is thicker than in normal individuals due to the abnormal repair of the bronchial epithelium by fibroblasts. Further, myofibroblasts are also in abundance in asthmatic individuals, due in part to pro-inflammatory stimulation.

The Gene 454 SNPs (Table 10; FIGS. 7A-7H, and 10) identified by the experiments described herein result in nucleotide changes that may disrupt the intracellular function, stability, splicing, or expression of the encoded protein. It is possible that the nucleotide changes cause an increase or decrease in the normal activities or levels of the P2X7 receptor, thereby affecting the pro-inflammatory response triggered by ATP, and resulting in asthmatic symptoms. The sum of these data indicates that Gene 454 (P2X7) is involved in the pathophysiology of respiratory disorders, including asthma.

2. Functional Role of Gene 561 in Asthma and Related Diseases

Gene 561 is the human ortholog of rat RIMBP2, a scaffold protein. RIMBP2 protein binds to RIM, a putative effector of Rab3, and appears to recruit synaptic vesicles by a tethering reaction. RIMBP2 is an intracellular protein that contains an SH3 domain, which is thought to be involved in binding to RIM. RIMBP2 also contains fibronectin type III repeats, which are rarely observed in intracellular proteins (Wang et al., 2000, JBC 275:20033-20044).

The nucleotide sequence of Gene 561.1 alternate splice variant (also referred to as 561.nt1) corresponds to SEQ ID NO:31 (FIGS. 27A-27K), and the encoded amino acid sequence (also referred to as Gene 561.aa1) corresponds to SEQ ID NO:120. The nucleotide sequence of Gene 561.2 alternate splice variant (also referred to as 561.nt2) corresponds to SEQ ID NO:32 (FIGS. 28A-28C), and the encoded amino acid sequence (also referred to as Gene 561.aa2) corresponds to SEQ ID NO:121. As determined by the experiments described herein, the transcript size of Gene 561.nt1 and Gene 561.nt2 is 7.9 and 6.1 Kb, respectively (FIG. 8). An alternative splice site has been identified at a position in the 3′UTR, between exons J and I (FIG. 8). RT-PCR data indicate that Gene 561.nt1 is clearly expressed in lung at low levels, but Gene 561.nt2 is not (FIG. 8). The genomic structure of Gene 561 comprises 21 exons and spans ˜200 Kb.

ATP has been shown to stimulate vagal afferent nerve terminals in the lung. This can lead to local axon and central vagal reflexes, which are known to play a major role in neurogenic inflammation and bronchoconstriction. Nocturnal asthma characterized by acute bronchoconstriction in the morning has been associated with platelet activation, which releases large amounts of ATP, and augmentation of vagal tone (Schulman et al., 1999, Am. J. Respir. Cell Mol. Biol. 20:530-537). It is possible that Gene 561 recruits synaptic vesicles for neurotransmitter release at the afferent nerve terminals in lung. This, in turn, may be important for bronchoconstriction/dilation. Accordingly, the Gene 561 SNPs that show association with asthma (Table 10, FIGS. 27A-27K, and FIGS. 28A-28C) may disrupt the function, stability, or expression of the encoded protein. The altered Gene 561 protein may cause an increase or decrease of neurotransmitter, resulting in augmentation of the vagal tone, and leading to bronchoconstriction. The sum of these data indicates that Gene 561 is involved in the pathophysiology of respiratory disorders including asthma.

3. Functional Role of Gene 757 in Asthma and Related Diseases

Immunocytochemical studies have shown that both TGF-β (transforming growth factor β) and EGFR1 (epidermal growth factor receptor) are highly expressed in areas of bronchial epithelial injury, and that these parallel pathways operate to repair epithelial cells (Puddicombe et al., 2000, FASEB J. 14:1362-1374). EGFR1 stimulates epithelial repair, while TGF-β regulates the production of profibrogenic growth factors and proinflammatory cytokines leading to extracellular matrix synthesis. TGF-β also acts in the WNT signaling pathway, which functions in a variety of developmental processes, including cell differentiation, cell polarity, cell migration, and cell proliferation (Calvo et al., 2000, PNAS 97:12776-12781). The WNT components activate the frizzled receptors, which stabilize β-catenin. This, in turn, activates the expression of target genes in the nucleus (Kühl et al., 2000, TIG 16:279-283).

Gene 757 is frizzled 10 (FZD10), a putative receptor for Wnt-7a (Kawakami et al., 2000, Develop. Growth Differ. 42:561-569). The nucleic acid sequence of Gene 757 corresponds to SEQ ID NO: 90, and the encoded amino acid sequence corresponds to SEQ ID NO: 153 (FIGS. 9A-9F). As determined by the experiments described herein, Gene 757 is expressed in brain, heart, skeletal muscle, colon, thymus, spleen, kidney, small intestine, placenta, and lung (FIG. 6). The transcript size of Gene 757 is 3.6 Kb, of which 3253 bp have been identified (FIG. 6). The transcript is contiguous with genomic DNA, indicating that Gene 757 is an intronless gene. The Gene 757 ORF is 1746 bp long and encodes a 581 amino acid protein. The 3′ untranslated region is 1052 bp long, and 456 bp of the 5′ UTR has been sequenced.

The FZD10 protein is a receptor composed of a seven-transmembrane repeat with an N-terminal cysteine-rich domain and a C-terminal Ser/Thr-XXX-Val motif. FZD10 shares 65.7% overall amino acid identity with FZD9 (Koike et al., 1999, Biochem. Biophys. Res. Commun. 262:39-43). Frizzled 10 is a cell surface receptor for the secreted glycoprotein Wnt-7a. In accordance with one model of respiratory diseases, the WNT signaling gene acts in concert with the frizzled 10 receptor to trigger a signal transduction pathway leading to the activation of genes involved in bronchial epithelial repair. Thus, Gene 757 SNPs that are associated with the asthma phenotype (Table 10 and FIGS. 9A-9F) may alter the signal transduction pathway, causing either the over- or underexpression of genes involved in bronchial epithelium repair. This alteration, in turn, may result in the activation of the epithelial-mesenchymal trophic unit in the lung, placing the bronchial epithelium in a “state of repair” mode, and leading to airway remodeling (Holgate et al., 1999, Clin. Exp. Allergy. Suppl 2:90-95). The sum of these data indicates that Gene 757 (FZD10) is directly involved in the pathophysiology of respiratory disorders including asthma.

Example 15 Protein Expression and Purification

Expression and purification of the chromosome 12q23-qter proteins of the invention can be performed essentially as follows. Nucleotide sequences (e.g., one or more of SEQ ID NO:1 to SEQ ID NO:92 and SEQ ID NO:156 to SEQ ID NO:4684) are prepared by polymerase chain reaction (PCR). Synthetic oligonucleotide primers specific for the 5′ and 3′ ends of the nucleotide sequences are designed and purchased from Life Technologies (Gaithersburg, Md.). All forward primers (specific for the 5′ end of the sequence) are designed to include an NcoI cloning site at the 5′ terminus. These primers are designed to permit initiation of protein translation at the methionine residue encoded within the NcoI site followed by a valine residue and the protein encoded by the nucleotide sequence. All reverse primers (specific for the 3′ end of the sequence) include an EcoRI site at the 5′ terminus to permit cloning of the sequence into the reading frame of the pET-28b expression vector (Novagen). The pET-28b vector provides a sequence encoding an additional 20 carboxyl-terminal amino acids including six histidine residues, which comprise the His-Tag affinity tag.
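
The primer layout described above can be sketched in code. This is an illustrative sketch under stated assumptions, not the primer design actually used: the two filler bases that complete the valine codon, the 20-nucleotide annealing length, the function names, and the toy target sequence are all hypothetical, and any adjustment needed to keep the insert in frame with the vector-encoded carboxyl-terminal tag is omitted. Only the NcoI (CCATGG) and EcoRI (GAATTC) recognition sequences are standard.

```python
NCOI_SITE = "CCATGG"    # NcoI recognition site; contains the ATG start codon
ECORI_SITE = "GAATTC"   # EcoRI recognition site
VALINE_FILL = "TT"      # assumption: completes G + TT = GTT (valine) after ATG

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(target: str, anneal_len: int = 20) -> tuple[str, str]:
    """Build forward and reverse primers with 5'-terminal cloning sites.

    The forward primer reads CC-ATG-GTT followed by the start of the target,
    so translation begins Met-Val and continues into the target sequence.
    """
    target = target.upper()
    if len(target) < anneal_len:
        raise ValueError("target shorter than the annealing length")
    forward = NCOI_SITE + VALINE_FILL + target[:anneal_len]
    reverse = ECORI_SITE + reverse_complement(target[-anneal_len:])
    return forward, reverse


# Toy 40-nucleotide target, for illustration only (not a sequence of the invention).
toy_target = "GCTAGCAAGGTTCTGACCGATCCGTACGGATTCAGGCTAA"
forward_primer, reverse_primer = design_primers(toy_target)
print("forward:", forward_primer)
print("reverse:", reverse_primer)
```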

Genomic DNA prepared from the 12q23-qter region, including the BAC sequences RPCI-11_0899A17, RPCI-11_0666B20, RPCI-11_0723P10, RPCI-11_0831E18, RPCI-11_0932D22 and RPCI-11_0702C13 (SEQ ID NO:719 to SEQ ID NO:978; Table 3A) and the BAC end sequences (SEQ ID NO:156 to SEQ ID NO:693), is used as the template for PCR amplification (Ausubel et al., 1994). For PCR amplification, cDNA (50 ng) is introduced into a reaction vial containing 2 mM MgCl₂, 1 μM synthetic primers (forward and reverse primers complementary to and flanking a defined 12q23-qter region), 0.2 mM of each dNTP (dATP, dGTP, dCTP, and dTTP), and 2.5 U heat stable DNA polymerase (Amplitaq, Roche Molecular Systems, Inc., Branchburg, N.J.) in a final volume of 100 μl.
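
The volumes needed to reach the stated final concentrations in a 100 μl reaction follow from the dilution relation C1·V1 = C2·V2. The sketch below is a worked example only: the stock concentrations (25 mM MgCl₂, 10 μM per primer, 10 mM of each dNTP, 5 U/μl polymerase) are assumptions not given in the text, as is the reading that 1 μM applies to each primer. Reaction buffer, template, and water are omitted.

```python
FINAL_VOLUME_UL = 100.0


def stock_volume_ul(final_conc: float, stock_conc: float,
                    final_volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Volume of stock to add so that C_stock * V_stock = C_final * V_final.

    Both concentrations must be expressed in the same unit.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc * final_volume_ul / stock_conc


# component: (final concentration from the text, assumed stock concentration)
REACTION = {
    "MgCl2": (2.0, 25.0),            # mM
    "forward primer": (1.0, 10.0),   # uM, assuming 1 uM of each primer
    "reverse primer": (1.0, 10.0),   # uM
    "dNTP mix": (0.2, 10.0),         # mM of each dNTP
}

volumes = {name: stock_volume_ul(final, stock)
           for name, (final, stock) in REACTION.items()}

# Enzyme is dosed in units rather than by concentration: 2.5 U at an assumed 5 U/ul.
polymerase_ul = 2.5 / 5.0

for name, vol in volumes.items():
    print(f"{name}: {vol:.1f} ul")
print(f"polymerase: {polymerase_ul:.1f} ul")
```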

Upon completion of thermal cycling reactions, each sample of amplified DNA is purified using the Qiaquick Spin PCR purification kit (QIAGEN, Gaithersburg, Md.). PCR products are subjected to digestion with the restriction endonucleases, e.g., NcoI and EcoRI (New England BioLabs, Beverly, Mass.) (Ausubel et al., 1994). The digested DNA is subjected to electrophoresis on 1.0% NuSeive (FMC BioProducts, Rockland, Me.) agarose gels. The gel is incubated with ethidium bromide, and the digested DNA is visualized with long-wave UV irradiation. The DNA fragments are isolated from the agarose gel, and are purified using the GeneClean Kit protocol (BIO 101, Vista, Calif.).

The pET-28b vector is prepared for cloning by digestion with restriction endonucleases, e.g., NcoI and EcoRI (New England BioLabs, Beverly, Mass.) (Ausubel et al., 1994). The digested pET-28b expression vector is ligated to the gel-isolated DNA fragments (Ausubel et al., 1994). The ligated product is used to transform E. coli (e.g., BL21) (Ausubel et al., 1994) as follows. Briefly, 1 μl of ligation reaction is mixed with 50 μl of electrocompetent BL21 cells, and the cells are subjected to a high voltage pulse. Following this, cells are incubated in 0.45 ml SOC medium (0.5% yeast extract, 2.0% tryptone, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl₂, 10 mM MgSO₄, and 20 mM glucose) at 37° C. with shaking for 1 hr. Cells are then spread on LB agar plates containing 25 μg/ml kanamycin sulfate, and grown overnight. Transformant BL21 colonies are then isolated and analyzed to evaluate cloned inserts, as described below.

Individual BL21 transformant colonies are analyzed by PCR amplification. The PCR reaction uses the same forward and reverse primers specific for the 12q23-qter region sequences that are used in the cloning step. Successful amplification verifies the ligation of the sequence in the expression vector (Ausubel et al., 1994). Individual BL21 colonies containing pET-28b vectors with 12q23-qter region nucleotide sequences are inoculated into 5 ml of LB broth plus 25 μg/ml kanamycin sulfate, and grown overnight. The following day, plasmid DNA is isolated and purified using the QIAGEN plasmid purification protocol (QIAGEN Inc., Chatsworth, Calif.).

The pET vector can be propagated in any E. coli K-12 strain, e.g., HMS174, HB101, JM109, DH5, and the like, for purposes of cloning or plasmid preparation. Hosts for expression include E. coli strains containing a chromosomal copy of the gene for T7 RNA polymerase. These hosts are lysogens of bacteriophage DE3, a lambda derivative that carries the lacI gene, the lacUV5 promoter, and the gene for T7 RNA polymerase. T7 RNA polymerase is induced by addition of isopropyl-β-D-thiogalactoside (IPTG), and the T7 RNA polymerase transcribes any target plasmid containing a functional T7 promoter, such as pET-28b, carrying its gene of interest. Strains include, for example, BL21(DE3) (Studier et al., 1990, Meth. Enzymol., 185:60-89).

To express the recombinant sequence, 50 ng of plasmid DNA isolated as described above is used to transform competent BL21(DE3) bacteria as described above (provided by Novagen as part of the pET expression kit). The lacZ gene (β-galactosidase) is expressed in the pET-System as described for the 12q23-qter region recombinant constructions. Transformed cells are grown in SOC medium for 1 hr, and then plated on LB plates containing 25 μg/ml kanamycin sulfate. The following day, the colonies are pooled and grown in LB medium containing kanamycin sulfate (25 μg/ml) to an optical density at 600 nm of 0.5 to 1.0 OD units. At that point, 1 mM IPTG is added to the culture for 3 hr to induce gene expression of the 12q23-qter sequences.

After induction of gene expression with IPTG, cells are collected by centrifugation in a Sorvall RC-3B centrifuge at 3500×g for 15 min at 4° C. Pellets are resuspended in 50 ml of cold mM Tris-HCl, pH 8.0, 0.1 M NaCl, and 0.1 mM EDTA (STE buffer). Cells are then centrifuged at 2000×g for 20 minutes at 4° C. Wet pellets are weighed and frozen at −80° C. until ready for protein purification.

The disclosure of each of the patents, patent applications, and publications cited in the specification is hereby incorporated by reference herein in its entirety.

Although the invention has been set forth in detail, one skilled in the art will recognize that numerous changes and modifications can be made, and that such changes and modifications may be made without departing from the spirit and scope of the invention.

What is claimed is:
 1. An isolated antibody or epitope-binding fragment thereof which specifically binds to a polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 120 wherein the specific binding is to the SEQ ID NO: 120 portion of the polypeptide.
 2. The isolated antibody or epitope-binding fragment of claim 1, which binds to an immunogenic component comprising at least 30 consecutive amino acid residues of SEQ ID NO: 120.
 3. The isolated antibody or epitope-binding fragment of claim 1, which binds to an immunogenic component comprising at least 50 consecutive amino acid residues of SEQ ID NO: 120.
 4. The isolated antibody or epitope-binding fragment of claim 1, which binds to an immunogenic component comprising at least 100 consecutive amino acid residues of SEQ ID NO: 120.
 5. The isolated antibody or epitope-binding fragment of claim 1 which binds to a polypeptide having an amino acid sequence of at least 200 consecutive residues of SEQ ID NO: 120 or an immunogenic component thereof.
 6. The isolated antibody or epitope-binding fragment of claim 1, which is immobilized on a solid support.
 7. The isolated antibody or epitope-binding fragment of claim 1, which binds to a polypeptide having an amino acid sequence of SEQ ID NO: 120.
 8. The isolated antibody or epitope-binding fragment of claim 1 which is a monoclonal antibody.
 9. The isolated antibody or epitope-binding fragment of claim 1 which is a polyclonal antibody.
 10. The isolated antibody or epitope-binding fragment of claim 1 which is a recombinant antibody.
 11. The isolated antibody or epitope-binding fragment of claim 1 which is a chimeric antibody.
 12. The isolated antibody or epitope-binding fragment of claim 1 which is a humanized antibody.
 13. The isolated antibody or epitope-binding fragment of claim 1 bound to said polypeptide.
 14. A composition comprising the isolated antibody or epitope-binding fragment of claim 1 and a pharmaceutically acceptable carrier.
 15. An isolated antibody or epitope-binding fragment thereof which specifically binds to a polypeptide comprising an amino acid sequence encoded by 50 or more consecutive nucleotides of SEQ ID NO: 31 wherein the specific binding is to the amino acid sequence encoded by 50 or more consecutive nucleotides of SEQ ID NO: 31 portion of the polypeptide.
 16. The isolated antibody or epitope-binding fragment of claim 15, which binds to an immunogenic component comprising at least 30 amino acid residues encoded by consecutive nucleotides of SEQ ID NO: 31.
 17. The isolated antibody or epitope-binding fragment of claim 15, which binds to an immunogenic component comprising at least 50 amino acid residues encoded by consecutive nucleotides of SEQ ID NO: 31.
 18. The isolated antibody or epitope-binding fragment of claim 15, which binds to an immunogenic component comprising at least 100 amino acid residues encoded by consecutive nucleotides of SEQ ID NO: 31.
 19. The isolated antibody or epitope-binding fragment of claim 15, which is immobilized on a solid support.
 20. The isolated antibody or epitope-binding fragment of claim 15 which is a monoclonal antibody.
 21. The isolated antibody or epitope-binding fragment of claim 15 which is a polyclonal antibody.
 22. The isolated antibody or epitope-binding fragment of claim 15 which is a recombinant antibody.
 23. The isolated antibody or epitope-binding fragment of claim 15 which is a chimeric antibody.
 24. The isolated antibody or epitope-binding fragment of claim 15 which is a humanized antibody.
 25. The isolated antibody or epitope-binding fragment of claim 15 bound to said polypeptide.
 26. A composition comprising the isolated antibody or epitope-binding fragment of claim 15 and a pharmaceutically acceptable carrier.